Is college worth it? A return-on-investment analysis
It makes my flesh crawl to see college reduced purely to ROI, but at least they're honest that that's what they're doing.
I just hope people at least consider all of the other "outputs" of going to college when reading something like this. The ROI analysis is good data; it answers an important question. But there are many, many other important questions worth answering. (And most of them aren't quantifiable, so it's not like you could do a study on them if you wanted to.)
College can be an awful experience for some people even if they end up making good money. And it can be an excellent life experience for people who end up making dirt. I got lucky -- I had a great experience, I learned a lot of important non-academic things, I broke out of my shell, and it's pretty directly responsible for a large portion of my earnings. (And yes, the numbers for my university and major from the study match my current income and age quite well.)
But now I have a kid in high school, and I'm facing all the questions about which futures to keep open and which to sacrifice in the service of others. College is much more expensive now. My family was dirt poor and so I had massive financial aid; any financial aid my kid gets will have to be merit-based. Degree inflation has sucked away a lot of the value of having a degree. I can afford to support a less secure (low-risk) path through life if I think it would be better for my kid as a human being. These are not easy questions to be facing.
Increasingly I see the idea of thinking of college in any other terms as a deliberate meme designed to make people pay extremely high prices and put them in a debt trap, and that the sources of that meme like it that way.
Before college can create a well-rounded citizenry or provide "excellent life experiences" or any of the other things people want of it, it must first provide a good ROI. This is a necessary foundation. If it is not doing that, then all the other fancy things are merely digging the debt trap deeper, because you're paying for all these things while failing to obtain a means of paying them back.
As a culture, in decades past we were used to college generally being so cheap that providing a good ROI wasn't that big a deal. It's much easier to get a good ROI out of something cheap. So we focused on the higher layers and could afford to neglect that they were always built on the foundation of college being, in general, a good ROI. Consequently we have come to misinterpret those higher-level things as the purpose of college.
But it is necessary that these aspirational benefits sit on a foundation of good ROI in order not to be abusive to the customer.
It is not and must not be considered some sort of betrayal to be worried about ROI for college. It must simply be seen as an understanding of the fact that as the costs have changed, the way we must analyze college has also changed. It was always true, it's just now it has manifested in a larger way.
College wouldn’t need to provide a good ROI if it were much less expensive. Just like, strictly speaking, going to a movie doesn’t provide a good ROI.
It’s one thing to take four years of your life deepening your thinking ability and stretching your mind without a good ROI, if you come out the other end without crushing debt and with prospects for employment that can sustain you, even if those prospects have nothing to do with your college experience. This is what college was like until the last four decades or so.
It’s entirely another to come out the other side in crippling debt and with no prospects for any kind of sustaining job.
If it provides the same value, but costs less, then by definition, the ROI is better. OTOH, if it puts you in crippling debt, but you have no prospects of a job to support that - the ROI becomes bad or even negative. So doesn't it capture all of that already?
In some sense, twice the cost for twice the benefit might be worth it to those who can afford it, because the time-wise investment (4 years or so of your life) remains constant.
No, it doesn't capture all of that already.
A "better" ROI doesn't necessarily mean a positive ROI.
Getting negative 10% return on a $1,000 investment doesn't matter as much as getting a negative 10% return on a $100,000 investment.
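That distinction can be made concrete with a small Python sketch (the numbers are illustrative only): the ROI percentage is identical in both cases, while the absolute loss, which ROI alone doesn't capture, differs by 100x.

```python
def roi(gain: float, cost: float) -> float:
    """Return on investment as a fraction of the amount invested."""
    return (gain - cost) / cost

# Two investments with the same -10% ROI but very different stakes.
small_cost, small_gain = 1_000, 900
large_cost, large_gain = 100_000, 90_000

assert roi(small_gain, small_cost) == roi(large_gain, large_cost) == -0.10

# The absolute loss is what differs by two orders of magnitude.
small_loss = small_cost - small_gain    # 100
large_loss = large_cost - large_gain    # 10,000
```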
Magnitude certainly is relevant to vector comparisons; but, if we define ROI as nominal rate of return, gross returns are not relevant to a comparison by that metric.
Return on Investment: https://en.wikipedia.org/wiki/Return_on_investment
From https://en.wikipedia.org/wiki/Vector_(mathematics_and_physic... :
> A Euclidean vector is thus an equivalence class of directed segments with the same magnitude (e.g., the length of the line segment (A, B)) and same direction (e.g., the direction from A to B).[3] In physics, Euclidean vectors are used to represent physical quantities that have both magnitude and direction, but are not located at a specific place, in contrast to scalars, which have no direction.[4] For example, velocity, forces and acceleration are represented by vectors
Quantitatively and qualitatively assess the direct and external benefits of {college, other alternatives} with criteria in addition to real monetary ROI?
From https://en.wikipedia.org/wiki/Welfare_economics
> Welfare economics also provides the theoretical foundations for particular instruments of public economics, including cost–benefit analysis,
Except that the grandparent post that you responded to was all about the magnitude of the loss rather than just ROI. If the magnitude of the loss is manageable, then the ROI becomes less important for life decisions.
So sure! If all we are looking at is ROI then you are right! By definition! As long as you restrict your refutation ("by that metric") in a way that ignores the additional metric I had been trying to add to the conversation. I was trying to point out that there are other metrics that matter.
Direct or External Loss?
Is the unique loss you identify not accounted for in the traditional ROI expression?
From https://news.ycombinator.com/item?id=18833730 :
>> Why would people make an investment with insufficient ROI (Return on Investment)?
> Insufficient information.
> College Scorecard [1] is a database with a web interface for finding and comparing schools according to a number of objective criteria. CollegeScorecard launched in 2015. It lists "Average Annual Cost", "Graduation Rate", and "Salary After Attending" on the search results pages. When you review a detail page for an institution, there are many additional statistics; things like: "Typical Total Debt After Graduation" and "Typical Monthly Loan Payment".
> The raw data behind CollegeScorecard can be downloaded from [2]. The "data_dictionary" tab of the "Data Dictionary" spreadsheet describes the data schema.
> [1] https://collegescorecard.ed.gov/
> [2] https://collegescorecard.ed.gov/data/
> Khan Academy > "College, careers, and more" [3] may be a helpful supplement to funding a full-time college admissions counselor in a secondary education institution
> [3] https://www.khanacademy.org/college-careers-more
https://www.khanacademy.org/college-careers-more/college-adm... :
- [ ] Video & exercise / Jupyter notebook under Exploring college options for Return on Investment (according to e.g. CollegeScorecard data)
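In that spirit, here is a rough, stdlib-only sketch of computing an ROI figure over Scorecard-style data. The column names, sample numbers, and the $30k no-degree earnings baseline are all my own assumptions for illustration, not the actual College Scorecard schema.

```python
import csv
import io

# A tiny inline sample in the spirit of the College Scorecard CSVs.
# Column names here are illustrative, not the real Scorecard schema.
sample = """school,avg_annual_cost,median_earnings_10yr
Example State U,12000,52000
Example Private College,38000,61000
"""

def rough_roi(avg_annual_cost, median_earnings, years=4, baseline=30000):
    """Very rough ROI: annual earnings premium over a no-degree baseline,
    divided by total cost of attendance. Ignores financing costs, dropout
    risk, foregone wages, and everything non-monetary."""
    total_cost = avg_annual_cost * years
    annual_premium = median_earnings - baseline
    return annual_premium / total_cost

for row in csv.DictReader(io.StringIO(sample)):
    r = rough_roi(int(row["avg_annual_cost"]), int(row["median_earnings_10yr"]))
    print(f'{row["school"]}: {r:.2f} annual premium per dollar of cost')
```

The real Scorecard download is much wider; the point is only that "Average Annual Cost" and "Salary After Attending" are enough for a first-pass comparison.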
Notes from the Meeting on Python GIL Removal Between Python Core and Sam Gross
Just came in here briefly to opine that there is a very real risk of a fork if the Python core community does not at least offer a viable alternative expediently.
The economic pressures surrounding the benefits of Gross’s changes will likely influence this more than any tears shed over subtle backwards incompatibility.
I believe it was Dropbox that famously released their own private internal Python build a while back and included some concurrency patches.
Many teams might go the route of working from Sam Gross’s work, and if we see subtle changes in underlying runtime concurrency semantics or something else backwards-incompatible, that’s it: either that adoption will roll downhill to a new standard or Python core will have to answer with a suitable GIL-less alternative.
I for one do not want to think about “ANSI Python” runtimes or give the MSFTs etc of the world an opening to divide the user base.
> I believe it was Dropbox that famously released their own private internal Python build a while back and included some concurrency patches.
Google also had their Unladen Swallow version, but it seems they lost interest at some point.
"PEP 3146 -- Merging Unladen Swallow into CPython" > Future Work (2010) https://www.python.org/dev/peps/pep-3146/#future-work
Perhaps Google/Grumpy could be updated to compile Python 3.x+ to Go with e.g. the RustPython version of the CPython Python Standard Library modules?
"Inside cpyext: Why emulating CPython C API is so Hard" (2018) https://news.ycombinator.com/item?id=18040664
Today, conda-forge compiles CPython to relocatable platform+architecture-specific binaries with LLVM. https://github.com/conda-forge/python-feedstock/blob/master/...
conda-forge also compiles PyPy Python to relocatable platform+architecture-specific binaries with LLVM. conda-forge/pypy3.6-feedstock (3.7) https://github.com/conda-forge/pypy3.6-feedstock/blob/master...
https://github.com/conda-forge/pypy-meta-feedstock/blob/mast... :
> summary: Metapackage to select pypy as python implementation
Pyodide (JupyterLite) compiles CPython to WASM (or LLVM IR?) with LLVM/emscripten IIRC. Hopefully there's a clear way to implement the new GIL-less multithreading support with Web Workers in WASM, too?
The https://rapids.ai/ org has a bunch of fast Python tools for HPC and Cloud; with Dask and your pick of scheduler. Less process overhead and less need for interprocess locking of memory handles that cross contexts, due to a new GIL-removal approach, would be even faster than debuggable one-process-per-core Python.
Grumpy is unmaintained. There is similar work (py2many) that transpiles Python 3 to 8 different languages.
The approach focuses on functional programming, does away with extensions completely.
For that approach to be successful, a pure python implementation of stdlib in the transpileable subset of python 3 would be super helpful.
Show HN: OtterTune – Automated Database Tuning Service for RDS MySQL/Postgres
Yo. OtterTune is a database optimization service. It uses machine learning to automatically tune your MySQL and Postgres configuration (i.e., RDS parameter groups) to improve performance and reduce costs. It does this by only looking at your database's runtime metrics (e.g., INNODB_METRICS, pg_stat_database, CloudWatch). We don't need to examine sensitive queries or user tables. We spun this project out of my research group at Carnegie Mellon University in 2020.
This week we've announced that OtterTune is now available to the public. We are offering everyone a starter account to try it out on their Postgres RDS or MySQL RDS databases (all versions, AWS US AZs only). We have seen OtterTune achieve 2-4x performance improvements and 50% cost reductions for these databases compared to using Amazon's default RDS configuration.
I am happy to answer any questions that you may have about how OtterTune works here.
-- Andy
================
More Info:
* 5min Demo Video: https://ottertune.com/blog/ottertune-explained-in-five-minutes
* Free Account Sign-up: https://ottertune.com/try
I know this Show HN is about RDS specifically, but the site suggests this works elsewhere, too; any issues with pointing OtterTune at DBs making heavy use of extensions (e.g. Timescale, Citus) or nonstandard deployment approaches (k8s, patroni, vitess)?
We have deployed OtterTune for on-prem databases (baremetal, containers). But we are not offering that to everyone right now because organizations manage their configurations for these deployments in a bunch of different ways (GitHub Actions, Chef/Puppet, Terraform). The nice thing about RDS is that they have a single API for updating configurations (https://docs.aws.amazon.com/AmazonRDS/latest/UserGuide/USER_...).
As for the Postgres-derivates that you mentioned (Timescale, Citus), we have not tested OtterTune for them yet. But there is no reason they shouldn't also work because they expose the same metrics API (pg_stat_database) and config knobs. There are just way more people using Postgres (RDS, Aurora), so we are focusing on them right now.
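As a sketch of what applying a tuned configuration through that RDS API could look like: the knob names and values below are illustrative, not OtterTune's actual output, and the boto3 call itself is shown only in a comment since it needs AWS credentials.

```python
# Build the Parameters structure expected by the RDS
# modify_db_parameter_group API from a {knob: value} mapping.
# Knob names/values below are illustrative, not OtterTune's output.

def build_rds_parameters(knobs: dict, apply_method: str = "pending-reboot"):
    """Translate a {knob: value} mapping into the list-of-dicts shape
    that rds.modify_db_parameter_group expects."""
    return [
        {
            "ParameterName": name,
            "ParameterValue": str(value),
            "ApplyMethod": apply_method,
        }
        for name, value in knobs.items()
    ]

params = build_rds_parameters({
    "shared_buffers": "{DBInstanceClassMemory/32768}",
    "work_mem": 65536,
})

# Applying them would look like (requires boto3 and AWS credentials):
#   import boto3
#   rds = boto3.client("rds")
#   rds.modify_db_parameter_group(
#       DBParameterGroupName="my-tuned-group", Parameters=params)
```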
What about OpenStack Trove DBaaS? OpenStack Trove is like an open source self-hosted Amazon RDS or Google CloudSQL. https://docs.openstack.org/trove/latest/
FWIU, Trove supports 10+ databases including MySQL and PostgreSQL.
AFAIU, there are sound reasons to host containers with e.g. OpenStack VMs instead of a k8s scheduler with a proper SAN and just figure out how to redundantly and resiliently sync - replicate, synchronize, primary/secondary, nodeprocd - and tune given the CAP theorem and the given DB implementation(s)?
Here's the official Ansible role for Trove, which provisions various e.g. SQL databases on an OpenStack cloud: https://github.com/openstack/openstack-ansible-os_trove
Despite having just 5.8% sales, over 38% of bug reports come from Linux
Notably, of those bug reports, fewer than 1% (only 3 bugs) were specific to the Linux version of the game. That is, over 99% of the bugs reported by Linux gamers also affected players on other platforms. Moreover (quoting from the OP):
> The report quality [from Linux users] is stellar. I mean we have all seen bug reports like: “it crashes for me after a few hours”. Do you know what a developer can do with such a report? Feel sorry at best. You can’t really fix any bug unless you can replicate it, see it with your own eyes, peek inside and finally see that it’s fixed. And with bug reports from Linux players is just something else. You get all the software/os versions, all the logs, you get core dumps and you get replication steps. Sometimes I got with the player over discord and we quickly iterated a few versions with progressive fixes to isolate the problem. You just don’t get that kind of engagement from anyone else.
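The kind of environment detail praised there can be captured automatically; a trivial Python sketch (the field names are my own choice, and an application would append its own build number and log paths):

```python
import platform
import sys

def bug_report_header() -> str:
    """Collect the environment details that make a bug report actionable:
    OS, architecture, and interpreter version. App-specific fields
    (build number, log paths, core dump locations) would be appended
    by the application itself."""
    return "\n".join([
        f"OS: {platform.platform()}",
        f"Arch: {platform.machine()}",
        f"Python: {sys.version.split()[0]}",
    ])

print(bug_report_header())
```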
Are there any good articles on writing helpful bug reports? Whenever I submit a bug I try to give as much detail and be as specific as possible but I always imagine some dev somewhere reading it rolling their eyes at the unnecessary paragraphs I’ve submitted.
From "Post-surgical deaths in Scotland drop by a third, attributed to a checklist" https://westurner.github.io/hnlog/#comment-19684376 https://news.ycombinator.com/item?id=19686470 :
> GitHub and GitLab support task checklists in Markdown and also project boards [...]
> GitHub and GitLab support (multiple) Issue and Pull Request templates:
> Default: /.github/ISSUE_TEMPLATE.md || Configure in web interface
> /.github/ISSUE_TEMPLATE/Name.md || /.gitlab/issue_templates/Name.md
> Default: /.github/PULL_REQUEST_TEMPLATE.md || Configure in web interface
> /.github/PULL_REQUEST_TEMPLATE/Name.md || /.gitlab/merge_request_templates/Name.md
> There are template templates in awesome-github-templates [1] and checklist template templates in github-issue-templates [2].
Arrow DataFusion includes Ballista, which does SIMD and GPU vectorized ops
From the Ballista README:
> How does this compare to Apache Spark? Ballista implements a similar design to Apache Spark, but there are some key differences.
> - The choice of Rust as the main execution language means that memory usage is deterministic and avoids the overhead of GC pauses.
> - Ballista is designed from the ground up to use columnar data, enabling a number of efficiencies such as vectorized processing (SIMD and GPU) and efficient compression. Although Spark does have some columnar support, it is still largely row-based today.
> - The combination of Rust and Arrow provides excellent memory efficiency and memory usage can be 5x - 10x lower than Apache Spark in some cases, which means that more processing can fit on a single node, reducing the overhead of distributed compute.
> - The use of Apache Arrow as the memory model and network protocol means that data can be exchanged between executors in any programming language with minimal serialization overhead.
Previous article from when Ballista was a separate repo from arrow-datafusion: "Ballista: Distributed compute platform implemented in Rust using Apache Arrow" https://news.ycombinator.com/item?id=25824399
Parsing gigabytes of JSON per second
The project described in the paper is https://simdjson.org/.
Source: https://github.com/simdjson/simdjson
PyPI: https://pypi.org/project/pysimdjson/
There's a rust port: https://github.com/simd-lite/simd-json
... From ijson ( https://pypi.org/project/ijson/#id3 ), which supports streaming JSON:
> Ijson provides several implementations of the actual parsing in the form of backends located in ijson/backends: [yajl2_c, yajl2_cffi, yajl2, yajl, python]
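ijson's event-based API is its own; as a rough stdlib-only illustration of the incremental-parsing idea, `json.JSONDecoder.raw_decode` can walk a buffer of concatenated JSON documents one at a time (a real streaming parser like ijson additionally avoids holding the whole input in memory):

```python
import json

def iter_json_stream(text):
    """Incrementally decode concatenated JSON documents from a string,
    yielding each one as soon as it is complete, similar in spirit to
    how streaming parsers avoid materializing everything at once."""
    decoder = json.JSONDecoder()
    idx = 0
    while idx < len(text):
        # Skip whitespace between documents.
        while idx < len(text) and text[idx].isspace():
            idx += 1
        if idx >= len(text):
            break
        obj, end = decoder.raw_decode(text, idx)
        yield obj
        idx = end
```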
Fed to ban policymakers from owning individual stocks
The relatively low salaries compared to the huge amounts of power are almost begging for insider trading, influence peddling, etc.
> The salary of a Congress member varies based on the job title of the congressman or senator. Most senators, representatives, delegates and the resident commissioner from Puerto Rico make a salary of $174,000 per year.
https://www.indeed.com/career-advice/pay-salary/congressman-...
Then again, some people have near insatiable greed and even at a $1M/year salary, some would be looking for ways to further boost their income at the boundaries of ethics, or beyond.
This is about the Fed banning its own policymakers from owning individual stocks, not about Congress.
Would the Fed actually have the authority to do that, i.e. regulate what securities members of Congress are allowed to hold?
“The Fed” as in the Federal Reserve: no.
“The Fed” as in the Federal Government: yes.
"Blind Trust" > "Use by US government officials to avoid conflicts of interest" https://en.wikipedia.org/wiki/Blind_trust
... If you want to help, you must throw all of your startup equity away.
... No, you may not co-brand with that company (which is not complicit with your agenda).
... Besides, I'm not even eligible for duty: you can't hire me.
... Maybe I could be more helpful from competitive private industry.
... How can a government hire prima donna talent like Iron Man?
... Is it criminal to start a solvent, sustainable business to solve government problems, for that one customer?
... Which operations can a government - operating with or without competition - solve most energy-efficiently and thus cost-effectively? Looks like single-payer healthcare and IDK what else?
(Edit)
US Digital Services Playbook: https://github.com/usds/playbook
From https://www.nist.gov/itl/applied-cybersecurity/nice/nice-fra... :
> "NIST Special Publication 800-181 revision 1, the Workforce Framework for Cybersecurity (NICE Framework), provides a set of building blocks for describing the tasks, knowledge, and skills that are needed to perform cybersecurity work performed by individuals and teams. Through these building blocks, the NICE Framework enables organizations to develop their workforces to perform cybersecurity work, and it helps learners to explore cybersecurity work and to engage in appropriate learning activities to develop their knowledge and skills."
From "NIST Special Publication 800-181 Revision 1: Workforce Framework for Cybersecurity (NICE Framework)" (2020) https://doi.org/10.6028/NIST.SP.800-181r1:
> 3.1 Using Existing Task, Knowledge, and Skill (TKS) Statements
(Edit) FedNow should - like mCBDC - really consider implementing Interledger Protocol (ILP) for RTGS "Real-Time Gross Settlement" https://interledger.org/developer-tools/get-started/overview...
From https://interledger.org/rfcs/0032-peering-clearing-settlemen... :
> Peering, Clearing and Settling; The Interledger network is a graph of nodes (connectors) that have peered with one another by establishing a means of exchanging ILP packets and a means of paying one another for the successful forwarding and delivery of the packets.
Fed or no, wouldn't you think there'd be money in solving for the https://performance.gov Goals ( https://www.usaspending.gov/ ) and the #GlobalGoals (UN Sustainable Development Goals) -aligned GRI Corporate Sustainability Report? #CSR #ESG #SustyReporting
Hardened wood as a renewable alternative to steel and plastic
From "Hemp Wood: A Comprehensive Guide" https://www.buildwithrise.com/stories/hempwood-the-sustainab... :
> HempWood is priced competitively to similar cuts of black walnut. You can purchase 72" HempWood boards for between $13 and $40 as of the date of publishing. HempWood also sells carving blocks, cabinets, and kits to make your own table. Prices for table kits range from $175 to $300. Jul 5, 2021 […]
> Is Hemp Wood Healthy? Due to its organic roots and soy-based adhesive, hemp wood is naturally non-toxic and doesn't contain VOCs, making it a healthier choice for interior building.
> Hemp wood has also been tested to have a decreased likelihood of warping and twisting. Its design is free of any of the knots common in other hardwoods to reduce wood waste.
FWIU, hempcrete - hemp hurds and a lime binder - must be framed; possibly with HempWood, which is stronger than spec lumber of the same dimensions.
FWIU, Hemp batting insulation is soaked in sodium to meet code.
Hopefully the production and distribution processes for these carbon sinks keep them net carbon-negative.
I want to like hempwood but the price needs to come down. Hopefully it will as production increases.
What are the limits? Input costs, current economy of scale?
Scale, reportedly. https://www.youtube.com/watch?v=3Qy6awPeric
Investors use AI to analyse CEOs’ language patterns and tone
This might be the best NewsArticle headline on HN I've ever seen.
Why, what does it say? Can you log that in a reproducible Notebook with Docs and Test assertions please?
Or are we talking about maybe a ScholarlyArticle CreativeWork with a https://schema.org/funder property or just name and url.
Graph of Keybase commits pre and post Zoom acquisition
Oh boy. Is it too early to say RIP?
The github repo looks ripe for a fork.
The servers are not FOSS and would need reimplementing.
Likely easy enough for a client based on E2E encryption principles; the backend is in many ways a (fancy) dumb pipe. (It could still require complex infrastructure, but at least there'd be relatively little "feature" code on the backend to be rewritten.)
FWIU, Cyph does Open Source E2E chat, files, and unlimited-length social posts to circles or to the public; but it doesn't yet do encrypted git repos, which could be solved with something like git-crypt. https://github.com/cyph/cyph
It would be wasteful to throw away the Web of Trust (people with handles to keys) that everyone entered into Keybase. Hopefully, Zoom will consider opening up the remaining pieces of Keybase if not just spinning the product back out to a separate entity?
From https://news.ycombinator.com/item?id=19185998 https://westurner.github.io/hnlog/#comment-19185998 :
> There's also "Web Key Directory"; which hosts GPG keys over HTTPS from a .well-known URL for a given user@domain identifier: https://wiki.gnupg.org/WKD
> GPG presumes secure key distribution
> Compared to existing PGP/GPG keyservers [HKP], WKD does rely upon HTTPS.
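A minimal sketch of how a WKD direct-method lookup URL is constructed, assuming the draft-koch-openpgp-webkey-service scheme (SHA-1 of the lowercased local part, z-base-32 encoded); the "advanced" openpgpkey-subdomain variant is omitted here:

```python
import hashlib

# z-base-32 alphabet used by WKD (draft-koch-openpgp-webkey-service).
ZB32 = "ybndrfg8ejkmcpqxot1uwisza345h769"

def zbase32(data: bytes) -> str:
    """Encode bytes as z-base-32: 5 bits per character, MSB first."""
    bits = "".join(f"{b:08b}" for b in data)
    return "".join(ZB32[int(bits[i:i + 5].ljust(5, "0"), 2)]
                   for i in range(0, len(bits), 5))

def wkd_direct_url(email: str) -> str:
    """Direct-method WKD URL for an email address: the 160-bit SHA-1
    digest of the lowercased local part becomes a 32-char z-base-32
    string under /.well-known/openpgpkey/hu/."""
    local, domain = email.split("@")
    digest = hashlib.sha1(local.lower().encode("utf-8")).digest()
    return (f"https://{domain.lower()}/.well-known/openpgpkey/hu/"
            f"{zbase32(digest)}?l={local}")
```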
Blockcerts can be signed when granted to a particular identity entity:
> Here are the open sources of blockchain-certificates/cert-issuer and blockchain-certificates/cert-verifier-js: https://github.com/blockchain-certificates
CT Certificate Transparency logs for key grants and revocations may depend upon a centralized or a decentralized Merkleized datastore: https://en.wikipedia.org/wiki/Certificate_Transparency
How do I specify the correct attributes of my schema.org/Person record (maybe on my JAMstack site) in order to approximate the list of identities that e.g. Keybase lets one register and refer to a cryptographic proof of?
Do I generate a W3C DID and claim my identities by listing them in a JSON-LD document signed with W3C ld-proofs (ld-signatures)? Which of the key directory and Web of Trust features of Keybase are covered by existing W3C spec Use Cases?
From https://news.ycombinator.com/item?id=28701355:
> "Use Cases and Requirements for Decentralized Identifiers" https://www.w3.org/TR/did-use-cases/
>> 2. Use Cases: Online shopper, Vehicle assemblies, Confidential Customer Engagement, Accessing Master Data of Entities, Transferable Skills Credentials, Cross-platform User-driven Sharing, Pseudonymous Work, Pseudonymity within a supply chain, Digital Permanent Resident Card, Importing retro toys, Public authority identity credentials (eIDAS), Correlation-controlled Services
> And then, IIUC W3C Verifiable Credentials / ld-proofs can be signed with W3C DID keys - that can also be generated or registered centrally, like hosted wallets or custody services. There are many Use Cases for Verifiable Credentials: https://www.w3.org/TR/vc-use-cases/ :
>> 3. User Needs: Education, Retail, Finance, Healthcare, Professional Credentials, Legal Identity, Devices
>> 4. User Tasks: Issue Claim, Assert Claim, Verify Claim, Store / Move Claim, Retrieve Claim, Revoke Claim
>> 5. Focal Use Cases: Citizenship by Parentage, Expert Dive Instructor, International Travel with Minor and Upgrade
>> 6. User Sequences: How a Verifiable Credential Might Be Created, How a Verifiable Credential Might Be Used
Is there an ACME-like thing to verify online identity control like Keybase still does?
Hopefully, Zoom will consider opening up the remaining pieces of Keybase if not just spinning the product back out to a separate entity?
> Is there an ACME-like thing to verify online identity control like Keybase still does?
From https://news.ycombinator.com/item?id=28926739 :
> NIST SP 800-63 https://pages.nist.gov/800-63-3/ :
> SP 800-63-3: Digital Identity Guidelines https://doi.org/10.6028/NIST.SP.800-63-3
> SP 800-63A: Enrollment and Identity Proofing https://doi.org/10.6028/NIST.SP.800-63a
FWIU, NIST SP 800-63A Enrollment and Identity Proofing specifies a spec sort of like ACME but for offline identity.
"Key server (cryptographic)" https://en.wikipedia.org/wiki/Key_server_(cryptographic)
> The last IETF draft for HKP also defines a distributed key server network, based on DNS SRV records: to find the key of someone@example.com, one can ask it by requesting example.com's key server.
> Keyserver examples: These are some keyservers that are often used for looking up keys with `gpg --recv-keys`.[6] These can be queried via https:// (HTTPS) or hkps:// (HKP over TLS) respectively: keys.openpgp.org , pgp.mit.edu , keyring.debian.org , keyserver.ubuntu.com ,
"Linked Data Signatures for GPG" https://gpg.jsld.org/
  npm i @transmute/lds-gpg2020 -g
  gpg2020 sign -u "3BCAC9A882DEFE703FD52079E9CB06E71794A713" $(pwd)/docs/example/doc.json did:btcr:xxcl-lzpq-q83a-0d5#yubikey
From https://gpg.jsld.org/contexts/#GpgSignature2020 :

> GpgSignature2020: A JSON-LD Document has been signed with GpgSignature2020, when it contains a proof field with type GpgSignature2020. The proof must contain a key signatureValue with value defined by the signing algorithm described here. Example:

  {
    "@context": [
      "https://gpg.jsld.org/contexts/lds-gpg2020-v0.0.jsonld",
      {
        "schema": "http://schema.org/",
        "name": "schema:name",
        "homepage": "schema:url",
        "image": "schema:image"
      }
    ],
    "name": "Manu Sporny",
    "homepage": "https://manu.sporny.org/",
    "image": "https://manu.sporny.org/images/manu.png",
    "proof": {
      "type": "GpgSignature2020",
      "created": "2020-02-16T18:21:26Z",
      "verificationMethod": "did:web:did.or13.io#20a968a458342f6b1a822c5bfddb584bdf141f95",
      "proofPurpose": "assertionMethod",
      "signatureValue": "-----BEGIN PGP SIGNATURE-----\n\niQEzBAABCAAdFiEEIKlopFg0L2sagixb/dtYS98UH5UFAl5JiCYACgkQ/dtYS98U\nH5U8TQf/WS92hXkdkdBQ0xJcaSkoTsGspshZ+lT98N2Dqu6I1Q01VKm+UMniv5s/\n3z4VX83KuO5xtepFjs4S95S4gLmr227H7veUdlmPrQtkGpvRG0Ks5mX7tPmJo2TN\nDwm1imm+zvJ+MXr3Ld24qaRJA9dI+AoZ5HXqNp96Yncj3oWD+DtVIZmC/ZiUw43a\nLpMYy94Hie7Ad86hEoqsdRxrwq7O6KZ29TAKi5T/taemayyXY7papU28mGjVEcvO\na7M3XNBflMcMEB+g6gjrANsgFNO6tOuvOQ2+4v6yMfpJ0ji4ta7q2d4QKqGi5YhE\nsRUORN+7HJrkmSTaT7gBpFQ+YUnyLA==\n=Uzp1\n-----END PGP SIGNATURE-----\n"
    }
  }

Single sign-on: What we learned during our identity alpha
}Single sign-on: What we learned during our identity alpha
Reading documents like passports with an NFC reader does indeed work, and does indeed produce verifiable material. Specifically the passport has proof (via a digital signature) that it was issued by a specific authority, and in turn, proof that the contents of the passport (name, date of birth, a picture and so on) are as issued.
But, the problem here is that the issuer is the British government, so, what are you proving? "Here, you issued this passport". "Oh yes, so we did". I presume the British government does own a database of the passports they issued, so this isn't news to them.
A modestly smart device, such as a Yubico device, is capable of providing fresh proof of its identity. My Security Key doesn't prove "the security key that enrolled me with GitHub in fact exists" (which would be redundant) but rather "I am still the same security key that you enrolled". The passport can't do that: your passport is inert, and the fact that Sarah Smith existed isn't the thing you presumably want to prove to a single-sign-on service. You want to prove that you are Sarah Smith, something the passport doesn't really do.
I think the GDS ignores this problem, which is to be fair no worse than lots of other systems, but the result isn't actually what it seems to be, all the digital technology isn't actually proving anybody's identity in this space.
It reminds me of the bad old days of the Web PKI where it was found that the "email validation" being used would accept automated "virus checking" of email. A CA sends the "Are you sure you want to issue a cert for mycorp.example?" message to somebody@mycorp.example and even though Somebody is on vacation in Barbados for two weeks, the automatic "virus" check reads the URL out of the email, follows it, ignores the page saying "Success, your certificate has been issued" and passes it to Somebody's inbox... All the "security" is doing what it was designed to do, but, what it was designed to do isn't what it should have been designed to do, and so it's futile.
Aren't there two things going on here? A single sign-on service, and also an identity check.
A yubico device can say "I'm still the same security key that was enrolled" but it doesn't say at enrolment "I am being used by Fred Jones of <address> with passport no 123, NiNo ABC".
GDS verify / government gateway do support 'authenticators' (typically password + totp) to provide a level of assurance that the person logging into the account is the person the account belongs to.
The "document check" is part of the identity bit. That is, how confident are we that the person creating this account / performing this action is who they say they are and can do this thing or see that information. The document check is part of their solution but other parts are supposed to layer on top of it to provide a proportionate level of assurance. e.g. a video of you holding up the passport to check that it's you using it or asking you to provide answers to details you already have like 'how much tax did you pay last year' or checking multiple documents.
> I presume the British government does own a database of the passports they issued, so this isn't news to them.
The passport office (part of the Home Office) own that database. Part of this work is basically a web service to that database for the rest of the government to use.
This is the difference between 'identity assurance' (at enrollment) and 'authentication assurance' in the relevant NIST standard, SP 800-63 https://pages.nist.gov/800-63-3/
A remote passport check might be suitable to claim an identity assurance level of 2, say, while getting to identity assurance level 3 would need an in-person visit with multiple forms of other government identification records.
In that way you might then constrain some actions to be taken remotely only by users who have provided strong assurance of their identity at enrollment (even if they login with strong non-phishable authentication... doesn't matter that the auth is ironclad if the user's identity is muddy).
Thx. NIST SP 800-63* https://pages.nist.gov/800-63-3/ :
> SP 800-63-3: Digital Identity Guidelines https://doi.org/10.6028/NIST.SP.800-63-3
> SP 800-63A: Enrollment and Identity Proofing https://doi.org/10.6028/NIST.SP.800-63a
> SP 800-63B: Authentication and Lifecycle Management https://doi.org/10.6028/NIST.SP.800-63b
> SP 800-63C: Federation and Assertions https://doi.org/10.6028/NIST.SP.800-63c
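A minimal sketch of the constraint described above (gating actions on enrollment-time identity assurance, independently of authentication strength), assuming an SP 800-63-style split between IAL and AAL; the action names and level mappings here are invented for illustration:

```python
# Illustrative only: SP 800-63 defines identity assurance levels (IAL)
# established at enrollment, separate from authentication assurance (AAL).
# These action-to-IAL mappings are made up, not from the standard.
REQUIRED_IAL = {
    "view_public_guidance": 1,        # self-asserted identity suffices
    "file_tax_return": 2,             # remote document check (e.g. passport)
    "change_registered_identity": 3,  # in-person proofing with multiple records
}

def is_allowed(action: str, user_ial: int) -> bool:
    """Permit the action only if the identity assurance level established
    at enrollment meets the minimum for that action -- ironclad
    authentication doesn't help if the enrolled identity is muddy."""
    return user_ial >= REQUIRED_IAL[action]
```

Even a user logging in with phishing-resistant authenticators would be blocked from the IAL-3 action until they completed stronger identity proofing.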
Five things we still don’t know about water
Also interesting:
https://sciencenordic.com/chemistry-climate-denmark/the-eart...
> The Earth has lost a quarter of its water
> In its early history, the Earth's oceans contained significantly more water than they do today. A new study indicates that hydrogen from split water molecules has escaped into space.
But where did the water come from? Neptune? Europa? Comet(s)? Is it just our distance from our star, here in the habitable zone, that makes liquid water likely?
From the article:
> But the exact mechanism for how water evaporates isn’t completely understood. The evaporation rate is traditionally represented in terms of a rate of collision between molecules, multiplied by a fudge factor called the evaporation coefficient, which varies between zero and one. Experimental determination of this coefficient, spanning several decades, has varied over three orders of magnitude.
From https://en.wikipedia.org/wiki/Evaporation :
> Evaporation is a type of vaporization that occurs on the surface of a liquid as it changes into the gas phase.[1] The surrounding gas must not be saturated with the evaporating substance. When the molecules of the liquid collide, they transfer energy to each other based on how they collide with each other. When a molecule near the surface absorbs enough energy to overcome the vapor pressure, it will escape and enter the surrounding air as a gas.[2] When evaporation occurs, the energy removed from the vaporized liquid will reduce the temperature of the liquid, resulting in evaporative cooling.[3]
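The "collision rate times a fudge factor" formulation the article describes is essentially the Hertz-Knudsen equation. A rough sketch of why a three-orders-of-magnitude spread in the measured coefficient matters (the numbers below are illustrative, not from the article):

```python
import math

R = 8.314  # gas constant, J/(mol*K)

def evaporation_mass_flux(p_sat, p_vap, temp_k, molar_mass, alpha):
    """Hertz-Knudsen mass flux in kg/(m^2*s).

    alpha is the empirical evaporation coefficient (between 0 and 1)
    that the article calls a 'fudge factor'; the flux prediction is
    directly proportional to it.
    """
    return alpha * (p_sat - p_vap) * math.sqrt(
        molar_mass / (2 * math.pi * R * temp_k)
    )

# Water near 25 C: saturation pressure ~3170 Pa, molar mass 0.018 kg/mol.
# With experimental values of alpha spanning three orders of magnitude
# (say 0.001 to 1), the predicted flux spans the same three orders.
flux_low = evaporation_mass_flux(3170, 0, 298.15, 0.018, 0.001)
flux_high = evaporation_mass_flux(3170, 0, 298.15, 0.018, 1.0)
```

Since the flux is linear in the coefficient, the decades-long disagreement over its value translates one-for-one into uncertainty in the predicted evaporation rate.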
New Optical Switch Up to 1000x Faster Than Transistors
Does this work at normal temperatures? Or does it need impractical cooling setups?
Show HN: I built a sonar into my surfboard
I wonder if there is some way you could pull data from the sound of the waves breaking instead of an active ping? That might be the only reasonable way to get accurate readings in that zone.
FWIU, EM backscatter can be used for e.g. gesture recognition, heartbeat detection, and metal detection. https://en.wikipedia.org/wiki/Backscatter
Entropy of wave noises may or may not be the issue.
Edit: (NASA spinoff) "Radar Device Detects Heartbeats Trapped under Wreckage" https://spinoff.nasa.gov/Spinoff2018/ps_1.html
> The Edgewood, Maryland-based company is developing a line of such remote sensing devices to aid search and rescue teams, based on advanced radar technologies developed by NASA and refined for this purpose at the Agency’s Jet Propulsion Laboratory (JPL).
> NASA has long analyzed weak radio signals to identify slight physical movements, such as seismic activity seen from low-Earth orbit or minor alterations in a satellite’s path around another planet that might indicate gravity fluctuations, explains Jim Lux, JPL’s task manager for the FINDER project. However, to pick out such faint patterns in the data, these devices must cancel out huge amounts of noise. “The core technology here is measuring a small signal in the context of another larger signal that’s confusing you,” Lux says.
(FWIW, some branches may have helicopters with infrared that they can send over for disaster relief.)
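A toy illustration of the "small signal in the context of another larger signal" problem Lux describes: a long FFT concentrates a faint periodic component into a single frequency bin, lifting it above broadband noise and large slow clutter. All amplitudes and frequencies below are made up for the sketch:

```python
import numpy as np

rng = np.random.default_rng(0)
fs = 100.0                      # sample rate, Hz
t = np.arange(0, 60, 1 / fs)    # 60-second record

heartbeat = 0.01 * np.sin(2 * np.pi * 1.2 * t)  # faint 1.2 Hz "pulse"
clutter = 1.0 * np.sin(2 * np.pi * 0.25 * t)    # large slow interference
noise = 0.1 * rng.standard_normal(t.size)       # broadband noise
signal = heartbeat + clutter + noise

# The heartbeat is 100x weaker than the clutter, yet a long FFT
# concentrates its energy into one bin, well above the noise floor.
spectrum = np.abs(np.fft.rfft(signal))
freqs = np.fft.rfftfreq(t.size, 1 / fs)
band = (freqs > 0.8) & (freqs < 3.0)            # plausible heart rates
peak_hz = freqs[band][np.argmax(spectrum[band])]
```

Real radar devices like FINDER face far harder versions of this (motion clutter, unknown frequencies), but the principle of integrating long enough to separate a weak periodic signal from stronger interference is the same.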
Cortical Column Networks
Hey, Cortical Columns!
From "Jeff Hawkins Is Finally Ready to Explain His Brain Research" https://news.ycombinator.com/item?id=18214707 https://westurner.github.io/hnlog/#comment-18218504
What does (parallel) spreading activation have to do with Cortical Column Networks maybe and redundancy? https://en.wikipedia.org/wiki/Spreading_activation
I got the impression that applying transformers to anything would be an advantage. So now I want everything to use them.
Doesn't sound like these CCNs necessarily reuse a lot of information across object categories.
From https://medium.com/syncedreview/google-replaces-bert-self-at... :
> New research from a Google team proposes replacing the self-attention sublayers with simple linear transformations that “mix” input tokens to significantly speed up the transformer encoder with limited accuracy cost. Even more surprisingly, the team discovers that replacing the self-attention sublayer with a standard, unparameterized Fourier Transform achieves 92 percent of the accuracy of BERT on the GLUE benchmark, with training times that are seven times faster on GPUs and twice as fast on TPUs."
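The Fourier-mixing sublayer described in that quote is simple enough to sketch; assuming the FNet recipe of an unparameterized 2D discrete Fourier transform over the sequence and hidden dimensions, keeping only the real part:

```python
import numpy as np

def fourier_mixing_sublayer(x):
    """FNet-style token mixing: replace the self-attention sublayer
    with a 2D DFT over the (sequence, hidden) dimensions and keep the
    real part. No learned parameters, so it's cheap and fast.

    x: (seq_len, hidden_dim) array of token embeddings.
    """
    return np.real(np.fft.fft2(x))

tokens = np.random.default_rng(0).standard_normal((8, 4))
mixed = fourier_mixing_sublayer(tokens)
assert mixed.shape == tokens.shape  # shape-preserving, like attention
```

Like attention, the transform lets every output position depend on every input token; unlike attention, the mixing pattern is fixed rather than content-dependent, which is where the reported accuracy gap comes from.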
What would Transformers (with self-attention) make better? Maybe QFT? There are quantum chemical interactions in the brain. Are they necessary or relevant at a given fidelity of emulation of a non-discrete brain?
Startup Ideas
Bring back the dumb terminal. Place mobile phone on a docking pad and see a standard PC OS on the screen - open apps, access files or use a browser. When done just grab the phone and go.
IIUC, in 2021, you can dock a PineTab or a PinePhone with a USB-C PD hub that has HDMI, USB, and Ethernet and use any of a number of Linux Desktop operating systems on a larger screen with full size keyboard and mouse.
The PineTab has a backlit keyboard and IIUC the PinePhone has a keyboard & aux battery case that doesn't yet also include the fingerprint sensor or wireless charging. https://www.pine64.org/blog/
It is easier to educate a Do-er than to motivate the educated
~ "Imagine that one could give you a copy of all of their knowledge. If you do not choose to apply and learn on your own, you can never."
This is about regimen, this is about stamina, this is about sticktoitiveness; and if you don't want it, you don't need it, you never will. And I mean never.
The Grit article on Wikipedia mentions persistence and tenacity and stick-to-it-iveness as roughly synonymous, and notes that grit may not be that distinct from other Big Five personality traits. But we're not about to listen to that, we're not going with that, because Grit is a predictor of success. https://en.wikipedia.org/wiki/Grit_(personality_trait)
To the original point,
> In psychology, grit is a positive, non-cognitive trait based on an individual's perseverance of effort combined with the passion for a particular long-term goal or end state (a powerful motivation to achieve an objective). This perseverance of effort promotes the overcoming of obstacles or challenges that lie on the path to accomplishment and serves as a driving force in achievement realization. Distinct but commonly associated concepts within the field of psychology include "perseverance", "hardiness", "resilience", "ambition", "need for achievement" and "conscientiousness". These constructs can be conceptualized as individual differences related to the accomplishment of work rather than talent or ability.
Are software engineering “best practices” just developer preferences?
"The parallel he drew was to another friend who’s a Civil Engineer. His friend had to be state certified and build everything to certain codes that stand up to specific stressors and inspections.
I gave him the usual answer about how Software Engineers deal with low stakes and high iterability compared to Civil Engineers, but, honestly, he has a point."
I've argued for a while now that Software Engineering with a big E should be licensed and regulated the same as any other Engineering discipline. Not all software is low-stakes and fast-changing. In fact, I'd argue the most important software never is: software for control systems, avionics, cars, etc. is very high stakes and has no reason to iterate beyond what is needed to interface with changing hardware. I think if software engineers had to be licensed to work on such things, then those 737s wouldn't have fallen out of the sky and Tesla wouldn't be allowed to beta test self-driving cars on public roads.
To those who ask what you would now call software engineers who don't work on those things: you are programmers. Or, if you prefer a less formal term, coders. Engineer is a powerful word and I don't like how we in the IT industry have appropriated it for less critical tasks.
In a way there are standards for software but those standards are not expressed in software terms.
Firstly, most software is harmless. If it goes wrong people may be annoyed, but no one is harmed. But if I write software to control an aircraft, then it would have to abide by aviation standards. If I write financial software then I would have financial regulations to follow. Same for medical devices. So, there are standards for software but they are indirect and expressed in terms of the wider domain in which it runs.
Critical systems: https://en.wikipedia.org/wiki/Critical_system :
> There are four types of critical systems: safety critical, mission critical, business critical and security critical.
Safety-critical systems > "Software engineering for safety-critical systems" https://en.wikipedia.org/wiki/Safety-critical_system#Softwar... :
> By setting a standard for which a system is required to be developed under, it forces the designers to stick to the requirements. The avionics industry has succeeded in producing standard methods for producing life-critical avionics software. Similar standards exist for industry, in general, (IEC 61508) and automotive (ISO 26262), medical (IEC 62304) and nuclear (IEC 61513) industries specifically. The standard approach is to carefully code, inspect, document, test, verify and analyze the system. Another approach is to certify a production system, a compiler, and then generate the system's code from specifications. Another approach uses formal methods to generate proofs that the code meets requirements.[11] All of these approaches improve the software quality in safety-critical systems by testing or eliminating manual steps in the development process, because people make mistakes, and these mistakes are the most common cause of potential life-threatening errors.
awesome-safety-critical lists very many resources for safety critical systems: https://awesome-safety-critical.readthedocs.io/en/latest/
There are many ['Engineering'] certification programs for software and other STEM fields. One test to qualify applicants does not qualify as a sufficient set of controls for safety critical systems that must be resilient, fault-tolerant, and redundant.
A real Engineer can recognize from review of even a little documentation that there are insufficient process controls; it's process wisdom from experience. An engineer starts with this premise: "There are insufficient controls to do this safely," because [test scenario parameter set n] would result in the system state - the output of what is probably a complex nonlinear dynamic system - being unacceptable: outside of acceptable parameters for safe operation.
Are there [formal] Engineering methods that should be requisite to "Computer Science" degrees? What about "Applied Secure Coding Practices in [Language]"? Is that sufficient to teach theory and formal methods?
From "How We Proved the Eth2 Deposit Contract Is Free of Runtime Errors" https://news.ycombinator.com/item?id=28513922 :
>> From "Discover and Prevent Linux Kernel Zero-Day Exploit Using Formal Verification" https://news.ycombinator.com/item?id=27442273 :
>> [Coq, VST, CompCert]
>> Formal methods: https://en.wikipedia.org/wiki/Formal_methods
>> Formal specification: https://en.wikipedia.org/wiki/Formal_specification
>> Implementation of formal specification: https://en.wikipedia.org/wiki/Anti-pattern#Software_engineer...
>> Formal verification: https://en.wikipedia.org/wiki/Formal_verification
>> From "Why Don't People Use Formal Methods?" https://news.ycombinator.com/item?id=18965964 :
>>> Which universities teach formal methods?
>>> - q=formal+verification https://www.class-central.com/search?q=formal+verification
>>> - q=formal+methods https://www.class-central.com/search?q=formal+methods
>>> Is formal verification a required course or curriculum competency for any Computer Science or Software Engineering / Computer Engineering degree programs? https://news.ycombinator.com/item?id=28513922
From "Ask HN: Is it worth it to learn C in 2020?" https://news.ycombinator.com/item?id=21878372 :
> There are a number of coding guidelines e.g. for safety-critical systems where bounded running time and resource consumption are essential. These coding guidelines and standards are basically only available for C, C++, and Ada.
awesome-safety-critical > Software safety standards: https://awesome-safety-critical.readthedocs.io/en/latest/#so...
awesome-safety-critical > Coding Guidelines: https://awesome-safety-critical.readthedocs.io/en/latest/#co...
Major Quantum Computing Strategy Suffers Serious Setbacks
Inevitably, this article (and implications about quantum computing) is going to be vastly misinterpreted due to the title. It would be like saying “Major Shuttle Strategy Suffers Serious Setbacks” but the strategy is using slingshots to get us into orbit, ignoring the work of, say, NASA and SpaceX.
This quantum computing strategy neither was nor is practiced by the vast majority of quantum institutions, commercial or otherwise. It was attempted by a group at Microsoft (and a small collection of other university groups), and it was known from the start that it would be a search for fundamentally new observations.
Other quantum computing players, like Rigetti, Google, HRL Laboratories, IBM, Amazon, Honeywell, and others, are pursuing approaches that are nothing like the one in the article, and have demonstrated significant results.
It should really be "A quantum computing strategy suffers..."
Or: "One idea in quantum computing bears no fruit"
"Quantized Majorana conductance not actually observed within indium antimonide nanowires"
"Quantum qubit substrate found to be apparently insufficient" (Given the given methods and probably available resources)
And then - in an attempt to use terminology from Constructor Theory https://en.m.wikipedia.org/wiki/Constructor_theory :
> In constructor theory, a transformation or change is described as a task. A constructor is a physical entity which is able to carry out a given task repeatedly. A task is only possible if a constructor capable of carrying it out exists, otherwise it is impossible. To work with constructor theory everything is expressed in terms of tasks. The properties of information are then expressed as relationships between possible- and impossible tasks. Counterfactuals are thus fundamental statements and the properties of information may be described by physical laws.[4] If a system has a set of attributes, the set of permutations of these attributes is seen as a set of tasks. A computation medium is a system whose attributes permute to always produce a possible task. The set of permutations, and hence of tasks, is a computation set. If it is possible to copy the attributes in the computation set, the computation medium is also an information medium.
> Information, or a given task, does not rely on a specific constructor. Any suitable constructor will serve. This ability of information to be carried on different physical systems or media is described as interoperability, and arises as the principle that the combination of two information media is also an information medium.[4] Media capable of carrying out quantum computations are called superinformation media, and are characterised by specific properties. Broadly, certain copying tasks on their states are impossible tasks. This is claimed to give rise to all the known differences between quantum and classical information.[4]
"Subsequent attempts to reproduce [Quantized Majorana conductance (topological qubits of arranged electrons) within indium antimonide nanowires] eventually as a (quantum) computation medium for the given tasks failed"
"Quantum computation by Majorana zero-mode (MZM) quasiparticles in indium antimonide nanowires not actually apparently possible"
... "But what about in DDR5?" Which leads us to a more generally interesting: "Rowhammer for qubits", which is already an actual Quantum on Silicon (QoS) thing.
Attempts to scientifically “rationalize” policy may be damaging democracy
First, not having read the article:
#EvidenceBasedPolicy is a worthwhile objective even if only because the alternative is to just blow money without measuring ROI at all [because government expenditures are the actual key to feeding the beast, the economic beast, the...].
What are some examples of policy failures where Systematic review and Meta-analysis could have averted loss, harms, waste, catastrophe, long-term costs? Is that cherry picking? The other times we can just throw a dart and that's better than, ahem, these idiots we afford trying to do science?
Wouldn't it be fair to require that constituent ScholarlyArticles (and other CreativeWorks) be kept on file with e.g. the Library of Congress?
Non-federal governments usually have very similar IT and science policy review needs. Should adapting one system for non-federal governments be more complex than specifying a different String or URL in the token_name field in a transaction?
When experts review ScholarlyArticles on our behalf, they should share their structured and unstructured annotations in such a way that their cryptographically signed reviews - and highlights to identify and extract structured facts like summary statistics like sample size and IRB-reviewed study controls - become part of a team-focused collaborative systematic meta-analysis that is kept on file and regularly reviewed in regards to e.g. retractions, typical cognitive biases, failures in experimental design and implementation, and general insufficiencies that should cause us to re-evaluate our beliefs given all available information which meets our established inclusion criteria.
We have a process for peer review of PDFs - and hopefully datasets with locality for reproducibility and unitarity which purportedly helps us work through something like this sequence:
Data / Information / Knowledge / Experience / Wisdom
We often have gaps in the processes that should support such progress in developing wisdom from knowledge, which should itself be predicated upon sound information and data; and then, with experience, bias creeps in.
Basic principles restricting the powers of the government should prevent the government - us, we - from specifically violating the protected rights of persons; but we have allowed "Science" to cloud our judgement in application of our most basic principles of justice - i.e. Life, Liberty, and the pursuit of Happiness; and Equality and Equitability - and should we chalk the unintended consequences up to ignorance or malice?
More science all around: more Data Literacy - awareness of how many bad statistical claims are made every day around the world - is good, necessary, and essential to Media Literacy, which is how we would be forming our opinions if we didn't have better tools for truth and belief from science.
"What does it mean to know?" etc.
Logic, Inference, Reasoning and Statistics probably predicated upon classical statistical mechanics are supposed to bring us closer to knowing: to bring our beliefs closer to the most widely observed truths.
Which Verifiable Claims do we trust? What studies do we admit into our personal and community meta-analyses according to our shared inclusion criteria?
"Preferred Reporting Items for Systematic Reviews and Meta-Analyses (PRISMA)" is one standard for meta-analyses, for example. http://www.prisma-statement.org/ . Could the bad guys or the dumb good guys lie with that control in place, too? Can knowing our rights - and upholding oaths to uphold values - protect us from meta-analytical group failure?
Perhaps STEAM (Science, Technology, Engineering, Art, and Medicine/Math) majors and other interested parties can help develop solutions for #EvidenceBasedPolicy?
This one fell flat. Maybe it was the time of day? The question should be asked every year, at least, eh? "Ask HN: Systems for supporting Evidence-Based Policy?" https://news.ycombinator.com/item?id=22920613
>> What tools and services would you recommend for evidence-based policy tasks like meta-analysis, solution criteria development, and planned evaluations according to the given criteria?
>> Are they open source? Do they work with linked open data?
> I suppose I should clarify that citizens, consumers, voters, and journalists are not acceptable answers
"#LinkedMetaAnalyses", "#StructuredPremises"; Ctrl-F "linkedmeta", "linkedrep", "#LinkedResearch": https://westurner.github.io/hnlog/
Alright, my fair biases disclosed, on to reading the actual article: /1
Response to 'Call for Review: Decentralized Identifiers (DIDs) v1.0'
Better imho to ask: what user problem is DID solving?
Because: if it's truly decentralized then there's no need to publish it. But publication is a core aspect of DID. That it must be published is a Jedi mind trick that lets "on a blockchain" in. Now we see the real problem being solved (not a user problem): the need to have something published on a blockchain.
The user problem, for those who see it that way, is that right now digital stuff related to a person is tied up primarily with the person's email address, or in some cases with the person's phone number.
(For the purposes of this discussion, call the email address or phone number an "identifier".)
Why is this a problem? Two reasons:
1. People don't "control" those identifiers
Many email addresses used in this context are controlled by employers, and the person's right to use them ends when they leave employment. Or they are controlled through expensive commercial arrangement between the person and the platform, and the person may lose access if they are no longer able to afford the platform. Or, they are free, offered by large data harvesting/advertising platforms, who mine data stored on the platform, and as below, linked to those identifiers from other platforms, to create advertising and propaganda targeting profiles.
2. Those identifiers are used by others to contact the person, and are therefore long-lived, which means they are also a vehicle for correlating an individual's activity across internet platforms who are necessarily presented with those identifiers by the person when they engage with the platform.
DIDs are an attempt to have identifiers that are controlled by people that:
* are inexpensive
* can be short-lived and "rotated"
* can be specific to the relationship between a person and a particular platform
* can support more tailored association of personal data to identifier
* can better support the person's management and correlation of their platform relationships, while minimizing, if desired, the correlation of identifiers back to a person by the platforms themselves
* and support other use cases
In terms of "decentralization" and "publishing": there is definitely a need to publish identifiers in some cases. People want to find others, and want to be found. Whether that publishing constitutes centralization is nuanced. But the key issue is that right now it is hard to impossible for a normal person to engage with a small or large platform without using a widely shared identifier.
[EDIT: whether normal users consider this to be a problem is an open question, as is whether they would if a solution to the problem existed...]
Somebody introduces a new technology to address these concerns every couple years and it doesn't go anywhere. These aren't actually problems to a lot of users. That's the real problem that needs to be solved - awareness. And that's a lot harder than taking the identity solutions we came up with in the Identity 2.0 days and adding a blockchain.
> Somebody introduces a new technology to address these concerns every couple years and it doesn't go anywhere. These aren't actually problems to a lot of users.
"Use Cases and Requirements for Decentralized Identifiers" https://www.w3.org/TR/did-use-cases/
> 2. Use Cases: Online shopper, Vehicle assemblies, Confidential Customer Engagement, Accessing Master Data of Entities, Transferable Skills Credentials, Cross-platform User-driven Sharing, Pseudonymous Work, Pseudonymity within a supply chain, Digital Permanent Resident Card, Importing retro toys, Public authority identity credentials (eIDAS), Correlation-controlled Services
And then, IIUC W3C Verifiable Credentials / ld-proofs can be signed with W3C DID keys - that can also be generated or registered centrally, like hosted wallets or custody services. There are many Use Cases for Verifiable Credentials: https://www.w3.org/TR/vc-use-cases/ :
> 3. User Needs: Education, Retail, Finance, Healthcare, Professional Credentials, Legal Identity, Devices
> 4. User Tasks: Issue Claim, Assert Claim, Verify Claim, Store / Move Claim, Retrieve Claim, Revoke Claim
> 5. Focal Use Cases: Citizenship by Parentage, Expert Dive Instructor, International Travel with Minor and Upgrade
> 6. User Sequences: How a Verifiable Credential Might Be Created, How a Verifiable Credential Might Be Used
IIRC DHS funded some of the W3C DID and Verified Credentials specification efforts. See also: https://news.ycombinator.com/item?id=26758099
There's probably already a good way to bridge between sub-SKU GS1 schema.org/identifier values on barcodes and QR codes and DIDs. For GS1, you must register a ~namespace prefix and then you can use the rest of the available address space within the barcode or QR code, IIUC.
DIDs could replace ORCIDs - which you can also just generate a new one of - for academics seeking to group their ScholarlyArticles under a better identifier than a transient university email address.
The new UUID formats may or may not be optionally useful in conjunction with W3C DID, VC, and Verifiable News, etc. https://news.ycombinator.com/item?id=28088213
When would a DID be a better choice than a UUID?
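One way to see the difference: a UUID is an opaque random label, while a DID is a scheme-plus-method identifier that a resolver can turn into a DID document holding public keys. A sketch using `did:example`, the placeholder method from the W3C spec (real methods like `did:key` or `did:web` define actual resolution rules):

```python
import uuid

# A UUID is globally unique but opaque: there is no built-in way to
# resolve it to keys or to prove control of it.
record_id = uuid.uuid4()

# A DID is structured as did:<method>:<method-specific-id>. The method
# tells a resolver how to fetch the DID document, which lists public
# keys the controller can use to prove control of the identifier.
did = "did:example:123456789abcdefghi"
scheme, method, method_id = did.split(":", 2)
assert scheme == "did" and method == "example"
```

So a UUID is the better choice when you only need a collision-free database key; a DID earns its extra machinery only when someone must later prove control of the identifier or rotate the keys behind it.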
Apple didn't revolutionize power supplies; new transistors did (2012)
While obviously Jobs’ claim was false, I will say that Apple is the only company I am aware of that manufactures power supplies which are reliably completely free of perceptible inductor whine. I have very acute high-frequency hearing and I often have to replace non-Apple USB(-C) switching power supplies with Apple ones so I don’t go crazy from the whining. Teardowns of Apple PSUs typically reveal very favorable electronic and industrial design as well.
It’s extremely frustrating, and I was always surprised when I returned mid-range USB chargers because of whine only to receive a replacement with the same problem and hundreds of reviews that failed to mention it. I’ve never had an issue with Apple chargers, and the extra cost is money well spent.
Buy genuine Apple chargers, if not for you, then for your dog.
What does my engineering manager do all day?
I’ve been in those meetings; we’ve all been in them, at different responsibility levels. We know that people are slacking at least 80% of the time and that most of those meetings could be efficiently replaced by emails.
So if your work is 90% meetings I am sorry but you have a bullshit job, nothing more, nothing less.
I kind of agree.
In my last role I ended up in a fantastic place where the teams under me could operate fairly autonomously and were happy. My only recurring meetings were:
- 1-1s fortnightly and only with senior members
- My 1-1 with my boss, again fortnightly
- Senior leadership team, once a week
I kept 1-1s to 30mins unless someone had an important issue, in which case I was completely flexible.
Most of my time went on planning/strategy, customer research, and acting as an ad-hoc coach for the teams beneath me.
The key things I found were:
- many meetings can be replaced by an update email
- decisions often work better through docs + feedback than big meetings
- you don't need frequent contact with the team if the goals and constraints are communicated very clearly
I realise none of this is groundbreaking. But it seems quite rare, and also hard to achieve in practice.
> - many meetings can be replaced by an update email
Highlights from the feed(s); GitLab has the better activity view IMHO but I haven't tried the new GitHub Issues beta yet.
3 questions from 5-minute Stand-Up Meetings (because everyone's actually standing there trying to leave), adapted for Digital Stand-Up Meetings: Since, Before, Obstacles:
## 2021-09-28
### @teammembername
#### Since
#### Before
#### Obstacles
Since: What have you done since last reporting back? Before: What do you plan to do before our next meeting? Obstacles: What needs which other team resources in order to solve the obstacles?

You can do cool video backgrounds for any video conferencing app with PipeWire.
You can ask team members to prep a .txt with their 3 questions and drop it in the chat, such that the team can reply to individual #fragments of your brief status report / continued employment justification argument.
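The Since/Before/Obstacles skeleton above is easy to stamp out per team member each day; a throwaway sketch (the function and member names are hypothetical):

```python
from datetime import date

def standup_template(members, day=None):
    """Render the daily Since/Before/Obstacles markdown skeleton for
    each team member, matching the heading layout in the example above."""
    day = day or date.today().isoformat()
    lines = [f"## {day}"]
    for name in members:
        lines += [f"### @{name}", "#### Since", "#### Before", "#### Obstacles"]
    return "\n".join(lines)

print(standup_template(["teammembername"], day="2021-09-28"))
```

Each member fills in their three sections before the meeting, and replies in chat can target the individual headings.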
> - decisions often work better through docs + feedback than big meetings
SO, ah, asynchronous communication doesn't require transcripting for the "Leader Assistant" that does the Mando quarterly minutes from the team chat logs, at least
6 Patterns of Collaboration: GRCOEB: Generate, Reduce, Clarify, Organize, Evaluate, Build Consensus [Six Patterns]; voting on specific Issues, and ideally Chat - [x] lineitems, and https://schema.org/SocialMediaPosting with emoji reactions
[Six Patterns]: http://wrdrd.github.io/docs/consulting/team-building#six-pat... , Text Templates, Collaboration Checklist: Weighted Criteria, Ranked-choice Voting.
Docs and posts with URLs and in-text pull-quotes do better than another list of citations at the end.
> - you don't need frequent contact with the team if the goals and constraints are communicated very clearly
Metrics: OKRs, KPIs, #GlobalGoals Goals Targets and Indicators
Tools / Methods; Data / Information / Knowledge / Experience / Wisdom:
- Issues: Title, - [ ] Description, Labels, Assignee, - [ ] Comments, Emoji Reactions;
- Pull Requests, - [ ] [Optional] [Formal] Reviews, Labels & "Codelabels", label:SkipPreflight, CI Build Logs, and Signed Deployed Documented Applications; code talks, the tests win again, docs sell
- Find and Choose - with Consensus - a sufficiently mature Component that already testably does: unified Email notifications (with inbound replies) and notifications on each and every Chat API, plus the web-standard thing, finally, thanks: W3C Web Notifications.
- Contribute Tests for [open source] Components.
- [ ] Create a workflow document with URLs and Text Templates
- [ ] Create a daily running document with my 3 questions and headings and indented markdown checkbox lists; possibly also with todotxt/todo.txt / TaskWarrior & BugWarrior -style lineitem markup.
What does an engineering manager do all day?
A polite answer would be: continuously reevaluate the tests of the product, and probably also the business model, if anyone knew what they were up to in there.
Using two keyboards at once for pain relief
> I tried a few of those Kinesis split keyboards. Too squishy for me. Not far enough apart. The CherryMX Kinesis split keyboard is too clicky for calls and screenshares. Muscle memory made it difficult to switch.
This is a really cool hack, and I’m happy that the author found a solution for their pain that works for them, but this bit confused me.
Kinesis are keyboards with separated key clusters, but not split keyboards. When one says split keyboard I think they are normally talking about things like the Ergodox EZ/Moonlander which have two physically separate bodies, one for each hand. There are many different models of these with various shapes and sizes, and you can separate them as much as you like. The normal advice is to set them up around shoulder width apart so you aren’t rounding your back to bring your arms together.
Most of these kinds of keyboards also support whatever key switches you prefer, and there are plenty of options that are sufficiently quiet for Zoom (pretty much anything linear should do the trick).
I have been using a Moonlander for a couple of years now, and an EZ before that. They are expensive at around $400 but I don’t think I can ever go back. Most of these split keyboards also run QMK so you can set up binds, layers, and generally configure them however you like.
It took me years to find a split, ergo-style keyboard with mechanical keys, but I finally did. The freestyles have them, but I don't like the super flat layout.
I am loving the split, angled setup with my preferred Cherry MX Blues.
The MS Natural split keyboards are easy to find, but they don't have satisfyingly clicky mechanical keys like in olden times.
How long do these last?
Edit: "Ergonomic keyboard" https://en.wikipedia.org/wiki/Ergonomic_keyboard > #Split_keyboard:
> Split keyboards group keys into two or more sections. Ergonomic split keyboards can be fixed, where you cannot change the positions of the sections, or adjustable. Split keyboards typically change the angle of each section, and the distance between them. On an adjustable split keyboard, this can be tailored exactly to the user. People with a broad chest will benefit from an adjustable split keyboard's ability to customize the distance between the two halves of the board. This ensures the elbows are not too close together when typing. [2]
Waydroid – Run Android containers on Ubuntu
Doubts:
- Is it better/faster/more compatible than anbox?
- Runs on ARM?
- Does it allow me to watch DRM streaming services on my Linux box?
- Can I install google play on it?
Waydroid has Android use your actual Linux kernel, so on an x86-64 host you’ll run x86-64 Android, and on an ARM host, ARM Android. This means that there will be some apps that won’t run on your Intel/AMD computer. I have no idea at all how common it is for Android apps to be tied to ARM, but I imagine that ARM64 will have helped with architecture-neutrality.
> This means that there will be some apps that won’t run on your Intel/AMD computer.
Should be able to still run them if you use binfmt_misc. Of course it will be slower but it's possible.
> binfmt_misc
https://en.wikipedia.org/wiki/Binfmt_misc
> binfmt_misc can also be combined with QEMU to execute programs for other processor architectures as if they were native binaries.[9]
QEMU supported [ARM guest] machines: https://wiki.qemu.org/Documentation/Platforms/ARM#Supported_...
Edit: from "Running and Building ARM Docker Containers on x86" (which also describes how to get CUDA working) https://www.stereolabs.com/docs/docker/building-arm-containe... :
sudo apt-get install qemu binfmt-support qemu-user-static # Install the qemu packages
docker run --rm --privileged multiarch/qemu-user-static --reset -p yes # Execute the registering scripts
docker run --rm -t arm64v8/ubuntu uname -m # Test the emulation environment
https://github.com/multiarch/qemu-user-static :> multiarch/qemu-user-static is to enable an execution of different multi-architecture containers by QEMU [1] and binfmt_misc [2]. Here are examples with Docker [3].
Why the heck isn't there just an official Android container and/or a LineageOS container?
It's not a certified device, so.
There are a number of ways to build "multi-arch docker images" e.g. for both x86 and ARM: OCI, docker build, podman build, buildx, buildah.
Containers are testable.
Here's this re: whether the official OpenWRT container should run /sbin/init in order to run procd and ubusd: https://github.com/docker-library/official-images/pull/7975#...
AFAIU, from a termux issue thread re: repackaging everything individually, latest Android requires binaries to be installed from APKs to get the SELinux context label necessary to run?
Biologists Rethink the Logic Behind Cells’ Molecular Signals
I think (hope) that the deterministic model known as lock and key has been known to be a flawed view for quite some time. Books published in the early 00s (notably "Ni Dieu ni gène", by Kupiec and Sonigo) were already making this point in a popular science format, and explaining that even the concept of cells exchanging signals was flawed.
A cell has no evolutionary reason to transmit signals. It will, however, eat molecules it can use and excrete molecules that are no longer needed, because this allows the cell to survive and reproduce. A white blood cell eating a bacterium doesn't do so with an intention to protect some organ somewhere in the body; it does so as a predator eats prey. Leukocytes that eat well, i.e. encounter bacteria they can eat, then multiply, and end up eating all the bacteria before dying off when there's nothing more to eat. So they protect the body and then remove themselves, not because they have the intention to do it for the greater good, or because they received a signal, but because they have evolved to prey on bacteria within the ecosystem of the body.
Thinking of the body as a well ordered mechanism is a flawed view, there are no locks and keys, and most likely very few signals if any. Thinking of the body as a dynamically balanced ecosystem seems much closer to how cells behave and to the fantastically complex feedback systems that have evolved over eons and are now balancing the populations of cells in our bodies.
We may be ecosystems of individually oblivious and dumb little cells, but isn't it wondrous that from this emerges a complexity that can say "I" and has consciousness of self?
But reproduction is a fundamental aspect of evolution, and (most?) cells in the body don't reproduce on their own; rather, they are manufactured. T cells, for example, are produced by bone marrow, and the thymus multiplies them, if I'm understanding correctly. So I'm not sure how that fits in with the view you explained in your post.
In other words the T cell doesn't get feedback on its own fitness. The fitness feedback is at the level of the reproductive success of a human being (healthy humans can have more kids than sick ones), not the reproductive success of a T cell (because it has no reproductive capabilities and therefore cannot be subject to selection pressures). Correct me if I'm wrong.
Most cells or matter in the body?
From https://www.nature.com/articles/nature.2016.19136 :
> A 'reference man' (one who is 70 kilograms, 20–30 years old and 1.7 metres tall) contains on average about 30 trillion human cells and 39 trillion bacteria, […] Those numbers are approximate — another person might have half as many or twice as many bacteria, for example — but far from the 10:1 ratio commonly assumed.
Symbiosis https://en.wikipedia.org/wiki/Symbiosis :
> Symbiosis […] is any type of a close and long-term biological interaction between two different biological organisms, be it mutualistic, commensalistic, or parasitic. […]
> Symbiosis can be obligatory, which means that one or more of the symbionts depend on each other for survival, or facultative (optional), when they can generally live independently. […]
> Symbiosis is also classified by physical attachment. When symbionts form a single body it is called conjunctive symbiosis, while all other arrangements are called disjunctive symbiosis.[3] When one organism lives on the surface of another, such as head lice on humans, it is called ectosymbiosis; when one partner lives inside the tissues of another, such as Symbiodinium within coral, it is termed endosymbiosis.
Endosymbiont: https://en.wikipedia.org/wiki/Endosymbiont :
> Two major types of organelle in eukaryotic cells, mitochondria and plastids such as chloroplasts, are considered to be bacterial endosymbionts.[6] This process is commonly referred to as symbiogenesis.
Symbiogenesis: https://en.wikipedia.org/wiki/Symbiogenesis #Secondary_endosymbiosis ... Viral eukaryogenesis: https://en.wikipedia.org/wiki/Viral_eukaryogenesis :
> A number of precepts in the theory are possible. For instance, a helical virus with a bilipid envelope bears a distinct resemblance to a highly simplified cellular nucleus (i.e., a DNA chromosome encapsulated within a lipid membrane). In theory, a large DNA virus could take control of a bacterial or archaeal cell. Instead of replicating and destroying the host cell, it would remain within the cell, thus overcoming the tradeoff dilemma typically faced by viruses. With the virus in control of the host cell's molecular machinery, it would effectively become a functional nucleus. Through the processes of mitosis and cytokinesis, the virus would thus recruit the entire cell as a symbiont—a new way to survive and proliferate.
T-Cell # Activation: https://en.wikipedia.org/wiki/T_cell#Activation
> Both are required for production of an effective immune response; in the absence of co-stimulation, T cell receptor signalling alone results in anergy. […]
> Once a T cell has been appropriately activated (i.e. has received signal one and signal two) it alters its cell surface expression of a variety of proteins.
T-cell receptor § Signaling pathway: https://en.wikipedia.org/wiki/T-cell_receptor#Signaling_path...
Co-stimulation : https://en.wikipedia.org/wiki/Co-stimulation :
> Co-stimulation is a secondary signal which immune cells rely on to activate an immune response in the presence of an antigen-presenting cell.[1] In the case of T cells, two stimuli are required to fully activate their immune response. During the activation of lymphocytes, co-stimulation is often crucial to the development of an effective immune response. Co-stimulation is required in addition to the antigen-specific signal from their antigen receptors.
Anergy: https://en.wikipedia.org/wiki/Clonal_anergy :
> [Clonal] Anergy is a term in immunobiology that describes a lack of reaction by the body's defense mechanisms to foreign substances, and consists of a direct induction of peripheral lymphocyte tolerance. An individual in a state of anergy often indicates that the immune system is unable to mount a normal immune response against a specific antigen, usually a self-antigen. Lymphocytes are said to be anergic when they fail to respond to their specific antigen. Anergy is one of three processes that induce tolerance, modifying the immune system to prevent self-destruction (the others being clonal deletion and immunoregulation ).[1]
Clonal deletion: https://en.wikipedia.org/wiki/Clonal_deletion :
> There are millions of B and T cells inside the body, both created within the bone marrow and the latter matures in the thymus, hence the T. Each of these lymphocytes express specificity to a particular epitope, or the part of an antigen to which B cell and T cell receptors recognize and bind. There is a large diversity of epitopes recognized and, as a result, it is possible for some B and T lymphocytes to develop with the ability to recognize self.[4] B and T cells are presented with self antigen after developing receptors while they are still in the primary lymphoid organs.[3][4] Those cells that demonstrate a high affinity for this self antigen are often subsequently deleted so they cannot create progeny, which helps protect the host against autoimmunity.[2][3] Thus, the host develops a tolerance for this antigen, or a self tolerance.[3]
"DNA threads released by activated CD4+ T lymphocytes provide autocrine costimulation" (2019) https://www.pnas.org/content/116/18/8985
> A growing body of literature has shown that, aside from carrying genetic information, both nuclear and mitochondrial DNA can be released by innate immune cells and promote inflammatory responses. Here we show that when CD4+ T lymphocytes, key orchestrators of adaptive immunity, are activated, they form a complex extracellular architecture composed of oxidized threads of DNA that provide autocrine costimulatory signals to T cells. We named these DNA extrusions “T helper-released extracellular DNA” (THREDs).
FWIU, there's also a gut-brain pathway? Or is that also this "signaling method" for feedback in symbiotic complex dynamic systems?
From https://en.wikipedia.org/wiki/Complex_system :
> Complex systems are systems whose behavior is intrinsically difficult to model due to the dependencies, competitions, relationships, or other types of interactions between their parts or between a given system and its environment. Systems that are "complex" have distinct properties that arise from these relationships, such as nonlinearity, emergence, spontaneous order, adaptation, and feedback loops, among others. Because such systems appear in a wide variety of fields, the commonalities among them have become the topic of their independent area of research. In many cases, it is useful to represent such a system as a network where the nodes represent the components and links to their interactions.
Graph, Hypergraph, Property graph, Linked Data, AtomSpace, RDF* + SPARQL*, ONNX, {...}
> The term complex systems often refers to the study of complex systems, which is an approach to science that investigates how relationships between a system's parts give rise to its collective behaviors and how the system interacts and forms relationships with its environment.[1] The study of complex systems regards collective, or system-wide, behaviors as the fundamental object of study; for this reason, complex systems can be understood as an alternative paradigm to reductionism, which attempts to explain systems in terms of their constituent parts and the individual interactions between them.
A multi-digraph of probably nonlinear relations may not be the best way to describe the fields of even just a few electroweak magnets?
> As an interdisciplinary domain, complex systems draws contributions from many different fields, such as the study of self-organization and critical phenomena from physics, that of spontaneous order from the social sciences, chaos from mathematics, adaptation from biology, and many others. Complex systems is therefore often used as a broad term encompassing a research approach to problems in many diverse disciplines, including statistical physics, information theory, nonlinear dynamics, anthropology, computer science, meteorology, sociology, economics, psychology, and biology.
... Glossary of Systems Theory: https://en.wikipedia.org/wiki/Glossary_of_systems_theory
The Shunting-yard algorithm converts infix notation to RPN
RosettaCode has examples of the Shunting-Yard algorithm for parsing infix notation ((1+2)*3)^4 to an AST or just a stack of data and operators such as RPN: [ ]
Parsing/Shunting-yard algorithm: https://rosettacode.org/wiki/Parsing/Shunting-yard_algorithm
Parsing/RPN to infix conversion: https://rosettacode.org/wiki/Parsing/RPN_to_infix_conversion...
Applications: testing all combinations of operators, with and without term grouping (parentheses); for example, evolutionary algorithms or universal function approximators that explore the space.
For example: https://github.com/westurner/notebooks/blob/gh-pages/maths/b... :
> This still isn't the complete set of possible solutions
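The core loop of the algorithm can be sketched in a few lines of Python. This is a minimal illustration handling only the binary operators + - * / ^ and parentheses; `PREC` and `shunting_yard` are names I've chosen, not from RosettaCode:

```python
# Shunting-yard sketch: converts tokenized infix notation to RPN.
PREC = {"+": 2, "-": 2, "*": 3, "/": 3, "^": 4}
RIGHT_ASSOC = {"^"}

def shunting_yard(tokens):
    output, ops = [], []
    for tok in tokens:
        if tok in PREC:
            # Pop operators of higher precedence (or equal, if left-associative).
            while (ops and ops[-1] != "(" and
                   (PREC[ops[-1]] > PREC[tok] or
                    (PREC[ops[-1]] == PREC[tok] and tok not in RIGHT_ASSOC))):
                output.append(ops.pop())
            ops.append(tok)
        elif tok == "(":
            ops.append(tok)
        elif tok == ")":
            while ops[-1] != "(":
                output.append(ops.pop())
            ops.pop()  # discard the "("
        else:
            output.append(tok)  # operand
    while ops:
        output.append(ops.pop())
    return output

print(shunting_yard("( ( 1 + 2 ) * 3 ) ^ 4".split()))
# ['1', '2', '+', '3', '*', '4', '^']
```

The RPN output can then be evaluated with a simple stack, or rebuilt into an AST, as in the RosettaCode tasks above.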
How should logarithms be taught?
As one shape of a curve; in a notebook that demonstrates multiple methods of curve fitting with and without a logarithmic transform.
Logarithm: https://simple.wikipedia.org/wiki/Logarithm ; https://en.wikipedia.org/wiki/Logarithm :
> In mathematics, the logarithm is the inverse function to exponentiation. That means the logarithm of a given number x is the exponent to which another fixed number, the base b, must be raised, to produce that number x.
List of logarithmic identities: https://en.wikipedia.org/wiki/List_of_logarithmic_identities
List of integrals of logarithmic functions: https://en.wikipedia.org/wiki/List_of_integrals_of_logarithm...
As functions in a math library or a CAS that should implement the axioms correctly:
Sympy Docs > Functions > Contents: https://docs.sympy.org/latest/modules/functions/index.html#c...
sympy.functions.elementary.exponential. log(x, base=e) == log(x)/log(e), exp(), LambertW(), exp_polar() https://docs.sympy.org/latest/modules/functions/elementary.h...
"Exponential, Logarithmic and Trigonometric Integrals" sympy.functions.special.error_functions. Ei: exponential integral, li: logarithmic integral, Li: offset logarithmic integral https://docs.sympy.org/latest/modules/functions/special.html...
numpy.log. log() base e, log2(), log10(), log1p(x) == log(1 + x) https://numpy.org/doc/stable/reference/generated/numpy.log.h...
numpy.exp. exp(), expm1(x) == exp(x) - 1, exp2(x) == 2**x https://numpy.org/doc/stable/reference/generated/numpy.exp.h...
Khan Academy > Algebra 2 > Unit: Logarithms: https://www.khanacademy.org/math/algebra2/x2ec2f6f830c9fb89:...
Khan Academy > Algebra (all content) > Unit: Exponential & logarithmic functions https://www.khanacademy.org/math/algebra-home/alg-exp-and-lo...
3blue1brown: "Logarithm Fundamentals | Lockdown math ep. 6", "What makes the natural log "natural"? | Lockdown math ep. 7" https://www.youtube.com/playlist?list=PLZHQObOWTQDP5CVelJJ1b...
Feynman Lectures 22-6: Algebra > Imaginary Exponents: https://www.feynmanlectures.caltech.edu/I_22.html#Ch22-S6
Power law functions: https://en.wikipedia.org/wiki/Power_law#Power-law_functions
In a two-body problem, of the 4-5 fundamental interactions: Gravity, Electroweak interaction, Strong interaction, Higgs interaction, a fifth force; which have constant exponential terms in their symbolic field descriptions? https://en.wikipedia.org/wiki/Fundamental_interaction#The_in...
Natural logs in natural systems:
Growth curve (biology) > Exponential growth: https://en.wikipedia.org/wiki/Growth_curve_(biology)#Exponen...
Basic reproduction number: https://en.wikipedia.org/wiki/Basic_reproduction_number
(... Growth hacking; awesome-growth-hacking: https://github.com/bekatom/awesome-growth-hacking )
Metcalfe's law: https://en.wikipedia.org/wiki/Metcalfe%27s_law
Moore's law; doubling time: https://en.wikipedia.org/wiki/Moore's_law
A block reward halving is a doubling of difficulty. What block reward difficulty schedule would be a sufficient inverse of Moore's law?
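The natural log ties these growth examples together: for exponential growth N(t) = N0 * e^(r*t), the doubling time is ln(2)/r. A minimal sketch (`doubling_time` is my own helper; the rate value is illustrative):

```python
import math

# Exponential growth N(t) = N0 * e**(r*t); doubling time t_d = ln(2) / r.
def doubling_time(r):
    """Time for an exponentially growing quantity to double, given rate r."""
    return math.log(2) / r

r = 0.05  # 5% continuous growth per unit time (illustrative value)
t_d = doubling_time(r)

# After one doubling time, the quantity is exactly twice its initial size.
N0 = 100.0
assert abs(N0 * math.exp(r * t_d) - 2 * N0) < 1e-9
print(round(t_d, 2))  # 13.86
```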
A few queries:
logarithm cheatsheet https://www.google.com/search?q=logarithm+cheatsheet
logarithm on pinterest https://www.pinterest.com/search/pins/?q=logarithm
logarithm common core worksheet https://www.google.com/search?q=logarithm+common+core+worksh...
logarithm common core autograded exercise (... Khan Academy randomizes from a parametrized (?) test bank for unlimited retakes for Mastery Learning) https://www.google.com/search?q=logarithm+common+core+autogr...
If only I had started my math career with a binder of notebooks or at least 3-hole-punched notes.
- [ ] Create a git repo with an environment.yml that contains e.g. `mamba install -y jupyter-book jupytext jupyter_contrib_extensions jupyterlab-git nbdime jupyter_console pandas matplotlib sympy altair requests-html`, build a container from said repo with repo2docker, and git commit and push changes made from within the JupyterLab instance that repo2docker layers on top of your reproducible software dependency requirement specification ("REES"). {bash/zsh, git, docker, repo2docker, jupyter, [MyST] markdown and $$ mathTeX $$; Google Colab, Kaggle Kernels, ml-workspace, JupyterLite}
"How I'm able to take notes in mathematics lectures using LaTeX and Vim" https://news.ycombinator.com/item?id=19448678
Here's something like MyST Markdown or Rmarkdown for Jupyter-Book and/or jupytext:
## Log functions
Log functions in the {PyData} community
### LaTeX
#### sympy2latex
What e.g. sympy2latex parses that LaTeX into, in terms of symbolic objects in an expression tree:
### numpy
see above
### scipy
### sympy
see above
### sagemath
### statsmodels
### TensorFlow
### PyTorch
## Logarithmic and exponential computational complexity
- Docs: https://www.bigocheatsheet.com/
- [ ] DOC: Rank these with O(1) first: O(n log n), O(log n), O(1), O(n), O(n^2) +growthcurve +exponential
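To sanity-check the ranking, one can evaluate each growth function at a large n and sort; slower-growing functions come first (an illustrative sketch; the names are my own):

```python
import math

# Empirical check of asymptotic ordering: evaluate each growth
# function at a large n and sort by the resulting magnitude.
growth = {
    "O(1)": lambda n: 1,
    "O(log n)": lambda n: math.log(n),
    "O(n)": lambda n: n,
    "O(n log n)": lambda n: n * math.log(n),
    "O(n^2)": lambda n: n ** 2,
}
n = 10**6
ranked = sorted(growth, key=lambda name: growth[name](n))
print(ranked)
# ['O(1)', 'O(log n)', 'O(n)', 'O(n log n)', 'O(n^2)']
```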
## Combinatorics, log, exp, and Shannon classical entropy and classical Boolean bits
https://www.google.com/search?q=formula+for+entropy :
$$ S = k_{B} \ln \Omega $$
Entropy > Statistical mechanics: https://en.wikipedia.org/wiki/Entropy#Statistical_mechanics ; SI unit for entropy: joules per kelvin (J·K⁻¹)
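The Shannon (classical bit) analogue of Boltzmann's formula uses log base 2, H = -Σ p·log2(p), as a quick Python sketch (`shannon_entropy` is my own helper name):

```python
import math

# Shannon entropy H = -sum(p * log2(p)) in bits: the classical-bit
# analogue of Boltzmann's S = k_B * ln(Omega).
def shannon_entropy(probs):
    return -sum(p * math.log2(p) for p in probs if p > 0)

# A fair coin carries exactly 1 bit of entropy; a fair 8-sided die, 3 bits.
print(shannon_entropy([0.5, 0.5]))    # 1.0
print(shannon_entropy([1 / 8] * 8))   # 3.0
```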
*****
In terms of specifying tasks for myself in order to learn {Logarithms,} I could use e.g. todo.txt markup to specify tasks with [project and concept] labels and contexts; but todo.txt doesn't support nested lists like markdown checkboxes with todo.txt markup and/or codelabels (if it's software math)
- [ ] Read the Logarithms wikipedia page <url> and take +notes +math +logarithms @workstation
- [o] Read
- [x] BLD: mathrepo: generate from cookiecutter or nbdev
- [ ] DOC: mathrepo: logarithm notes
- [ ] DOC,ART: mathrepo: create exponential and logarithmic charts +logarithms @workstation
- [ ] ENH,TST,DOC: mathrepo: logarithms with stdlib math, numpy, sympy (and *pytest* or at least `assert` assertion expressions)
- [ ] ENH,TST,DOC: mathrepo: logarithms and exponents with NN libraries (and *pytest*)
Math (and logic; ultimately thermodynamics) transcends disciplines. Not to bikeshed - a name can be sed-replaced later - but it's worth choosing a good name now.
Is 'mathrepo' the best scope for this project? Smaller dependency sets (i.e. a simpler environment.yml) seem to result in fewer version conflicts. `conda env export --from-history; mamba env export --from-history; pip freeze; pipenv -h; poetry -h`
### LaTeX
$$ y = \log_{b} x \iff b^{y} = x $$
$$ 2^3 = 8 $$
$$ \log_{2} 8 = 3 $$
$$ \ln e = 1 $$
$$ \log_b(xy)=\log_b(x)+\log_b(y) $$
$$ \begin{align}
\textit{(1) } \log_b(xy) & = \log_b(x)+\log_b(y)
\end{align} $$
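The product rule log_b(xy) = log_b(x) + log_b(y) can be verified both symbolically and numerically (a sketch assuming sympy is installed; declaring the symbols positive makes the identity unconditional):

```python
import math

import sympy

# Symbolic check: expand_log applies the product rule when the
# symbols are known to be positive.
x, y = sympy.symbols("x y", positive=True)
assert sympy.expand_log(sympy.log(x * y)) == sympy.log(x) + sympy.log(y)

# Numeric spot-check in base 2: log2(8 * 4) == log2(8) + log2(4) == 5.
assert math.isclose(math.log(8 * 4, 2), math.log(8, 2) + math.log(4, 2))
```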
Sources: https://en.wikipedia.org/w/index.php?title=List_of_logarithm...
#### sympy2latex
What e.g. sympy2latex parses that LaTeX into, in terms of symbolic objects in an expression tree:
# install
#!python -m pip install antlr4-python3-runtime sympy
#!mamba install -y -q antlr-python-runtime sympy
import sympy
from sympy.parsing.latex import parse_latex
from IPython.display import display

def displaylatexexpr(latex):
    expr = parse_latex(latex)
    display(str(expr))
    display(expr)
    return expr

displaylatexexpr(r'\log_{2} 8')
# 'log(8, 2)'
displaylatexexpr(r'\log_{2} 8 = 3')
# 'Eq(log(8, 2), 3)'
displaylatexexpr(r'\log_b(xy) = \log_b(x)+\log_b(y)')
# 'Eq(log(x*y, b), log(x, b) + log(y, b))'
displaylatexexpr(r'\log_{b} (xy) = \log_{b}(x)+\log_{b}(y)')
# 'Eq(log(x*y, b), log(x, b) + log(y, b))'
displaylatexexpr(r'\log_{2} (xy) = \log_{2}(x)+\log_{2}(y)')
# 'Eq(log(x*y, 2), log(x, 2) + log(y, 2))'
### python standard library
https://docs.python.org/3/library/operator.html#operator.pow
https://docs.python.org/3/library/math.html#power-and-logari...
math. exp(x), expm1(x), log(x, base=e), log1p(x), log2(x), log10(x), pow(x, y) : float; assert sqrt(x) == pow(x, 1/2)
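Quick spot-checks of those stdlib functions (illustrative values; note that log1p/expm1 exist precisely because log(1 + x) and exp(x) - 1 lose precision for tiny x):

```python
import math

x = 1e-10
# log1p/expm1 keep full precision where log(1 + x) and exp(x) - 1 would not:
# for small x, log(1 + x) ≈ x and exp(x) - 1 ≈ x.
assert math.isclose(math.log1p(x), x, rel_tol=1e-9)
assert math.isclose(math.expm1(x), x, rel_tol=1e-9)

assert math.isclose(math.log(8, 2), 3.0)   # two-argument log(x, base)
assert math.log2(8) == 3.0                 # log2 is exact for powers of two
assert math.isclose(math.log10(1000), 3.0)
assert math.isclose(math.sqrt(2), math.pow(2, 1 / 2))  # sqrt(x) == x**0.5
```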
### scipy
https://docs.scipy.org/doc/scipy/reference/generated/scipy.s... scipy.special. xlog1py()
https://docs.scipy.org/doc/scipy/reference/generated/scipy.s...
### sagemath
https://doc.sagemath.org/html/en/reference/functions/sage/fu...
### statsmodels
### TensorFlow
https://www.tensorflow.org/api_docs/python/tf/math tf.math. log(), log1p(), log_sigmoid(), exp(), expm1()
https://keras.io/api/layers/activations/
Softplus ("SmoothReLU") is a smooth, logarithm-based approximation of the ReLU activation function, ln(1 + e^x): https://en.wikipedia.org/wiki/Rectifier_(neural_networks)#So...
E.g. Softmax & LogSumExp also include natural logarithms in their definitions: https://en.wikipedia.org/wiki/Softmax_function
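A minimal sketch of the log-sum-exp trick (the `logsumexp` helper here is my own; SciPy ships an equivalent as scipy.special.logsumexp):

```python
import math

# The log-sum-exp trick: shift by the max so exp() never overflows.
def logsumexp(xs):
    m = max(xs)
    return m + math.log(sum(math.exp(x - m) for x in xs))

# Naive log(sum(exp(x))) overflows for large inputs (math.exp(1000)
# raises OverflowError); the shifted form does not.
xs = [1000.0, 1000.0]
assert math.isclose(logsumexp(xs), 1000.0 + math.log(2))

# Softmax follows as exp(x_i - logsumexp(xs)); here both entries are 0.5.
softmax = [math.exp(x - logsumexp(xs)) for x in xs]
assert math.isclose(sum(softmax), 1.0)
```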
### PyTorch
https://pytorch.org/docs/stable/generated/torch.log.html torch. log(), log10(), log1p(), log2(), exp(), exp2(), expm1(); logaddexp() , logaddexp2(), logsumexp(), torch.special.xlog1py()
***
Regarding this learning process and these tools: now I have a few replies to myself (!) in not-quite-markdown and with various headings. I should consolidate this information into a [MyST] markdown Jupyter Notebook and rework the whole thing. If this had been decent markdown from the start, I'd have less markup work to do to create a ScholarlyArticle / Notebook.
Automatic cipher suite ordering in Go’s crypto/tls
Summary: The Go team decides the ranked order in which TLS cipher suites are used. You are no longer able to choose the cipher suite order yourself; the Go team sets it in the code and will update it as they see fit.
My take on this is that Filippo is taking a heavy-handed approach here. This works for the majority of "dev writes code fast, ships it over the wall" scenarios. But it also falls apart in a couple of scenarios:
* Companies/govt agencies which mandate the use of certain algorithms, such as RSA, and both sides of the connection are running Golang code. Now you need to vendor the TLS library to allow for what the company wants.
* If an issue is discovered with an algorithm it would be really nice to be able to turn off the algorithm in production until the issue can be patched. I'm not sure if this is possible with the current set of changes.
Any government agency that mandates use of specific TLS algorithms (like a whitelist) almost certainly requires FIPS cryptography (or classified cryptography), so you won’t be using crypto/tls anyway.
As a security/cryptography engineer, I love this change: for cryptography, it’s clear that more knobs == more problems.
As a developer, however, I dislike it: more knobs make it easier to get the code to do what I need it to do.
Personally, I think it’s a good choice by Filippo.
From "Go Crypto and Kubernetes — FIPS 140–2 and FedRAMP Compliance" (2021) https://gokulchandrapr.medium.com/go-crypto-and-kubernetes-f... :
> If a vendor wants to supply cloud-based services to the US Federal Government, then they have to get FedRAMP approval. This certification process covers a whole host of security issues, but is very specific about its requirements on cryptography: usage of FIPS 140–2 validated modules wherever cryptography is needed, these encryption standards protect the cryptographic module from being cracked, altered, or otherwise tampered with. FIPS 140–2 validated encryption is a prerequisite for FedRAMP. [...]
> [...] Go Cryptography and Kubernetes — FIPS 140–2 Kubernetes is a Go project, as are most of the Kubernetes subcomponents and ecosystem. Golang has a crypto standard library, Golang Crypto which fulfills almost all the application crypto needs (TLS stack implementation for HTTPS servers and clients all the way to HMAC or any other primitive that are needed to make signatures to verify hashes, encrypt messages.). Go has made a different choice compared to most languages, which usually come with links or wrappers for OpenSSL or simply don’t provide any cryptography in the standard library (Rust doesn’t have standard library cryptography, JavaScript only has web crypto, Python doesn’t come with a crypto standard library). [...]
> The native go crypto is not FIPS compliant and there are few open proposals to facilitate Go code to meet FIPS requirements. Users can use prominent go compilers/toolsets backed by FIPS validated SSL libraries provided by Google or Redhat which enables Go to bypass the standard library cryptographic routines and instead call into a FIPS 140–2 validated cryptographic library. These toolsets are available as container images, where users can use the same to compile any Go based applications. [...]
> When a RHEL system is booted in FIPS mode, Go will instead call into OpenSSL via a new package that bridges between Go and OpenSSL. This also can be manually enabled by setting `GOLANG_FIPS=1`. The Go Toolset is available as a container image that can be downloaded from Red Hat Container Registry. Red Hat mentions that this as a new feature built on top of existing upstream work (BoringSSL). [...]
> To be FIPS 140–2 compliant, the module must use FIPS 140–2 compliant algorithms, ciphers, key establishment methods, and other protection profiles.
> FIPS-approved algorithms do change at times; not extremely frequently, but more often than they come out with a new version of FIPS 140. [...]
> Some of the fundamental requirements (not limited to) are as follows:
> [...] Support for TLS 1.0 and TLS 1.1 is now deprecated (only allowed in certain cases). TLS 1.3 is the preferred option, while TLS 1.2 is only tolerated.
> [...] DSA/RSA/ECDSA are only approved for key generation/signature.
> [...] The 0-RTT option in TLS 1.3 should be avoided.
Was there lag between the release of TLS 1.3 and an updated release of FIPS 140? @18f @DefenseDigital Can those systems be upgraded as easily?
NIST just moves slowly is all.
Common Criteria (NIAP) still only allows TLSv1.2 and TLSv1.1.
Scikit-Learn Version 1.0
There are scikit-learn (sklearn) API-compatible wrappers for e.g. PyTorch and TensorFlow.
Skorch: https://github.com/skorch-dev/skorch
tf.keras.wrappers.scikit_learn: https://www.tensorflow.org/api_docs/python/tf/keras/wrappers...
AFAIU, there are no Yellowbrick visualizers for PyTorch or TensorFlow; though PyTorch and TensorFlow work with TensorBoard for visualizing CFG execution.
> Many machine learning libraries implement the scikit-learn `estimator API` to easily integrate alternative optimization or decision methods into a data science workflow. Because of this, it seems like it should be simple to drop in a non-scikit-learn estimator into a Yellowbrick visualizer, and in principle, it is. However, the reality is a bit more complicated.
> Yellowbrick visualizers often utilize more than just the method interface of estimators (e.g. `fit()` and `predict()`), relying on the learned attributes (object properties with a single underscore suffix, e.g. `coef_`). The issue is that when a third-party estimator does not expose these attributes, truly gnarly exceptions and tracebacks occur. Yellowbrick is meant to aid machine learning diagnostics reasoning, therefore instead of just allowing drop-in functionality that may cause confusion, we’ve created a wrapper functionality that is a bit kinder with its messaging.
Looks like there are Yellowbrick wrappers for XGBoost, CatBoost, CuML, and Spark MLlib; but not for NNs yet. https://www.scikit-yb.org/en/latest/api/contrib/wrapper.html...
From the RAPIDS.ai CuML team: https://docs.rapids.ai/api/cuml/stable/ :
> cuML is a suite of fast, GPU-accelerated machine learning algorithms designed for data science and analytical tasks. Our API mirrors Sklearn’s, and we provide practitioners with the easy fit-predict-transform paradigm without ever having to program on a GPU.
> As data gets larger, algorithms running on a CPU become slow and cumbersome. RAPIDS provides users a streamlined approach where data is initially loaded in the GPU, and compute tasks can be performed on it directly.
CuML is not an NN library; but there are likely performance optimizations from CuDF and CuML that would accelerate performance of NNs as well.
Dask ML works with models with sklearn interfaces, XGBoost, LightGBM, PyTorch, and TensorFlow: https://ml.dask.org/ :
> Scikit-Learn API
> In all cases Dask-ML endeavors to provide a single unified interface around the familiar NumPy, Pandas, and Scikit-Learn APIs. Users familiar with Scikit-Learn should feel at home with Dask-ML.
dask-labextension for JupyterLab helps to visualize Dask ML CFGs which call predictors and classifiers with sklearn interfaces: https://github.com/dask/dask-labextension
Well, this is really the reason I asked my question.
It is a pain to move in and out of scikit and write all these wrappers and converters. I would prefer to do more in one framework.
For example, doing hyperparameter optimization in pytorch using scikit can be a bit painful sometimes
Ctrl-F automl https://westurner.github.io/hnlog/
> /? hierarchical automl "sklearn" site:github.com : https://www.google.com/search?q=hierarchical+automl+%22sklea...
https://westurner.github.io/hnlog/#comment-18798244
> Dask-ML works with {scikit-learn, xgboost, tensorflow, TPOT,}. ETL is your responsibility. Loading things into parquet format affords a lot of flexibility in terms of (non-SQL) datastores or just efficiently packed files on disk that need to be paged into/over in RAM. (Edit)
scale-scikit-learn https://examples.dask.org/machine-learning/scale-scikit-lear... -> dask.distributed parallel predication: https://examples.dask.org/machine-learning/parallel-predicti...
"Hyperparameter optimization with Dask" https://examples.dask.org/machine-learning/hyperparam-opt.ht...
> Sklearn.pipeline.Pipeline API: {fit(), transform(), predict(), score(),} https://scikit-learn.org/stable/modules/generated/sklearn.pi... : ```
decision_function(X) # Apply transforms, and decision_function of the final estimator
fit(X[, y]) # Fit the model
fit_predict(X[, y]) # Applies fit_predict of last step in pipeline after transforms.
fit_transform(X[, y]) # Fit the model and transform with the final estimator
get_params([deep]) # Get parameters for this estimator.
predict(X, **predict_params) # Apply transforms to the data, and predict with the final estimator
predict_log_proba(X) # Apply transforms, and predict_log_proba of the final estimator
predict_proba(X) # Apply transforms, and predict_proba of the final estimator
score(X[, y, sample_weight]) # Apply transforms, and score with the final estimator
score_samples(X) # Apply transforms, and score_samples of the final estimator.
set_params(**kwargs) # Set the parameters of this estimator
```
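The method table above boils down to one dispatch rule: apply each transform in order, then call the corresponding method on the final estimator. A stdlib-only sketch of that rule (not sklearn's actual implementation; `Doubler` and `LastFeature` are toy steps):

```python
class TinyPipeline:
    """Sketch of sklearn.pipeline.Pipeline's core dispatch:
    all but the last step transform X; the last step does the rest."""

    def __init__(self, steps):
        self.steps = steps  # list of (name, estimator) pairs

    def _transform(self, X):
        for _, step in self.steps[:-1]:
            X = step.transform(X)
        return X

    def fit(self, X, y=None):
        for _, step in self.steps[:-1]:
            X = step.fit(X, y).transform(X)
        self.steps[-1][1].fit(X, y)
        return self

    def predict(self, X):
        return self.steps[-1][1].predict(self._transform(X))


class Doubler:
    def fit(self, X, y=None):
        return self

    def transform(self, X):
        return [[2 * v for v in row] for row in X]


class LastFeature:
    def fit(self, X, y=None):
        return self

    def predict(self, X):
        return [row[-1] for row in X]


pipe = TinyPipeline([("double", Doubler()), ("model", LastFeature())])
pipe.fit([[1, 2]], None)
print(pipe.predict([[3, 4]]))  # [8]
```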
> https://docs.featuretools.com can also minimize ad-hoc boilerplate ETL / feature engineering :
>> Featuretools is a framework to perform automated feature engineering. It excels at transforming temporal and relational datasets into feature matrices for machine learning
From https://featuretools.alteryx.com/en/stable/guides/using_dask... :
> Creating a feature matrix from a very large dataset can be problematic if the underlying pandas dataframes that make up the entities cannot easily fit in memory. To help get around this issue, Featuretools supports creating Entity and EntitySet objects from Dask dataframes. A Dask EntitySet can then be passed to featuretools.dfs or featuretools.calculate_feature_matrix to create a feature matrix, which will be returned as a Dask dataframe. In addition to working on larger than memory datasets, this approach also allows users to take advantage of the parallel and distributed processing capabilities offered by Dask
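At its core, that kind of automated feature engineering aggregates child rows up to a parent entity to build one feature row per entity. A stdlib-only sketch of the idea (table and feature names invented for illustration; real Featuretools operates on EntitySets of pandas/Dask dataframes):

```python
from collections import defaultdict

# Child table: transactions, each pointing at a parent customer.
transactions = [
    {"customer_id": 1, "amount": 10.0},
    {"customer_id": 1, "amount": 30.0},
    {"customer_id": 2, "amount": 5.0},
]

# Group child rows by the parent entity's key.
grouped = defaultdict(list)
for tx in transactions:
    grouped[tx["customer_id"]].append(tx["amount"])

# One feature row per parent entity, built from aggregation primitives.
feature_matrix = {
    cid: {
        "COUNT(transactions)": len(amounts),
        "SUM(transactions.amount)": sum(amounts),
        "MEAN(transactions.amount)": sum(amounts) / len(amounts),
    }
    for cid, amounts in grouped.items()
}
print(feature_matrix[1])  # customer 1: 2 transactions, sum 40.0, mean 20.0
```

With Dask, the same groupby/aggregate step is what gets partitioned and distributed when the child table doesn't fit in memory.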
Signed Exchanges on Google Search
From https://blog.cloudflare.com/automatic-signed-exchanges/ :
> The broader implication of SXGs is that they make content portable: content delivered via an SXG can be easily distributed by third parties while maintaining full assurance and attribution of its origin. Historically, the only way for a site to use a third party to distribute its content while maintaining attribution has been for the site to share its SSL certificates with the distributor. This has security drawbacks. Moreover, it is a far stretch from making content truly portable.
> In the long-term, truly portable content can be used to achieve use cases like fully offline experiences. In the immediate term, the primary use case of SXGs is the delivery of faster user experiences by providing content in an easily cacheable format. Specifically, Google Search will cache and sometimes prefetch SXGs. For sites that receive a large portion of their traffic from Google Search, SXGs can be an important tool for delivering faster page loads to users.
> It’s also possible that all sites could eventually support this standard. Every time a site is loaded, all the linked articles could be pre-loaded. Web speeds across the board would be dramatically increased.
"Signed HTTP Exchanges" draft-yasskin-http-origin-signed-responses https://wicg.github.io/webpackage/draft-yasskin-http-origin-...
"Bundled HTTP Exchanges" draft-yasskin-wpack-bundled-exchanges https://wicg.github.io/webpackage/draft-yasskin-wpack-bundle... :
> Web bundles provide a way to bundle up groups of HTTP responses, with the request URLs and content negotiation that produced them, to transmit or store together. They can include multiple top-level resources with one identified as the default by a primaryUrl metadata, provide random access to their component exchanges, and efficiently store 8-bit resources.
From https://web.dev/web-bundles/ :
> Introducing the Web Bundles API. A Web Bundle is a file format for encapsulating one or more HTTP resources in a single file. It can include one or more HTML files, JavaScript files, images, or stylesheets.
> Web Bundles, more formally known as Bundled HTTP Exchanges, are part of the Web Packaging proposal.
> HTTP resources in a Web Bundle are indexed by request URLs, and can optionally come with signatures that vouch for the resources. Signatures allow browsers to understand and verify where each resource came from, and treats each as coming from its true origin. This is similar to how Signed HTTP Exchanges, a feature for signing a single HTTP resource, are handled.
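The core move in both SXGs and signed Web Bundles is that a distributor can serve content while the browser attributes it to the signing origin, because the signature covers the request URL and response. A stdlib HMAC sketch of that idea (real SXGs use certificate-based signatures over a CBOR structure, not a shared-secret HMAC; this is only an analogy):

```python
import hmac
import hashlib

ORIGIN_KEY = b"origin-private-key"  # stand-in; real SXGs sign with a certificate


def sign_exchange(url, body):
    """Origin signs (url, body) so any cache can redistribute the pair."""
    msg = url.encode() + b"\0" + body
    return hmac.new(ORIGIN_KEY, msg, hashlib.sha256).hexdigest()


def verify_exchange(url, body, signature):
    """Client verifies before attributing the content to the origin."""
    return hmac.compare_digest(sign_exchange(url, body), signature)


sig = sign_exchange("https://example.org/article", b"<html>hi</html>")
assert verify_exchange("https://example.org/article", b"<html>hi</html>", sig)
# Tampering by the distributor breaks attribution:
assert not verify_exchange("https://example.org/article", b"<html>tampered</html>", sig)
```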
AlphaGo documentary (2020) [video]
I'm curious what's next for this series of AI at DeepMind. Are there other, more challenging problems they are tackling at the moment? I read they have already mastered StarCraft 2.
Have they stopped this series of AI and gone all in on protein folding at the moment?
This blog post mentions testing on Atari and being applied to chemistry and quantum physics
https://deepmind.com/blog/article/muzero-mastering-go-chess-...
AlphaFold 2 solved the CASP protein folding problem that, AFAIU, e.g. Folding@home et al. have been churning at for a while. From November 2020: https://deepmind.com/blog/article/alphafold-a-solution-to-a-...
https://en.wikipedia.org/wiki/AlphaFold#SARS-CoV-2 :
> AlphaFold has been used to predict structures of proteins of SARS-CoV-2, the causative agent of COVID-19 [...] The team acknowledged that though these protein structures might not be the subject of ongoing therapeutic research efforts, they will add to the community's understanding of the SARS-CoV-2 virus.[74] Specifically, AlphaFold 2's prediction of the structure of the ORF3a protein was very similar to the structure determined by researchers at University of California, Berkeley using cryo-electron microscopy. This specific protein is believed to assist the virus in breaking out of the host cell once it replicates. This protein is also believed to play a role in triggering the inflammatory response to the infection (... Berkeley ALS and SLAC beamlines ... S309 & Sotrovimab: https://scitechdaily.com/inescapable-covid-19-antibody-disco... )
Is there yet an open implementation of AlphaFold 2? edit: https://github.com/search?q=alphafold ... https://github.com/deepmind/alphafold
How do I reframe this problem in terms of fundamental algorithmic complexity classes (and thus the Quantum Algorithm Zoo thing that might optimize the currently fundamentally algorithmically computationally hard part of the hot loop that is the cost driver in this implementation)?
To cite in full from the MuZero blog post from December 2020: https://deepmind.com/blog/article/muzero-mastering-go-chess-... :
> Researchers have tried to tackle this major challenge in AI by using two main approaches: lookahead search or model-based planning.
> Systems that use lookahead search, such as AlphaZero, have achieved remarkable success in classic games such as checkers, chess and poker, but rely on being given knowledge of their environment’s dynamics, such as the rules of the game or an accurate simulator. This makes it difficult to apply them to messy real world problems, which are typically complex and hard to distill into simple rules.
> Model-based systems aim to address this issue by learning an accurate model of an environment’s dynamics, and then using it to plan. However, the complexity of modelling every aspect of an environment has meant these algorithms are unable to compete in visually rich domains, such as Atari. Until now, the best results on Atari are from model-free systems, such as DQN, R2D2 and Agent57. As the name suggests, model-free algorithms do not use a learned model and instead estimate what is the best action to take next.
> MuZero uses a different approach to overcome the limitations of previous approaches. Instead of trying to model the entire environment, MuZero just models aspects that are important to the agent’s decision-making process. After all, knowing an umbrella will keep you dry is more useful to know than modelling the pattern of raindrops in the air.
> Specifically, MuZero models three elements of the environment that are critical to planning:
> * The value: how good is the current position?
> * The policy: which action is the best to take?
> * The reward: how good was the last action?
> These are all learned using a deep neural network and are all that is needed for MuZero to understand what happens when it takes a certain action and to plan accordingly.
> Illustration of how Monte Carlo Tree Search can be used to plan with the MuZero neural networks. Starting at the current position in the game (schematic Go board at the top of the animation), MuZero uses the representation function (h) to map from the observation to an embedding used by the neural network (s0). Using the dynamics function (g) and the prediction function (f), MuZero can then consider possible future sequences of actions (a), and choose the best action.
> MuZero uses the experience it collects when interacting with the environment to train its neural network. This experience includes both observations and rewards from the environment, as well as the results of searches performed when deciding on the best action.
> During training, the model is unrolled alongside the collected experience, at each step predicting the previously saved information: the value function v predicts the sum of observed rewards (u), the policy estimate (p) predicts the previous search outcome (π), the reward estimate r predicts the last observed reward (u). This approach comes with another major benefit: MuZero can repeatedly use its learned model to improve its planning, rather than collecting new data from the environment. For example, in tests on the Atari suite, this variant - known as MuZero Reanalyze - used the learned model 90% of the time to re-plan what should have been done in past episodes.
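The three learned quantities above (value, policy, reward) are enough to plan with: from a latent state, roll each candidate action through the dynamics function and score it by predicted reward plus the value of the resulting state. A toy stdlib sketch of that one-step lookahead (the functions below are stand-ins for MuZero's learned networks g and f; the numbers are invented):

```python
# Stand-ins for MuZero's learned functions; the real ones are neural nets.
def dynamics(state, action):
    """g: (state, action) -> (next_state, reward)."""
    return state + action, float(action)


def value(state):
    """f's value head: how good is this state? Toy: near 10 is good."""
    return -abs(10 - state)


def plan(state, actions):
    """One-step lookahead: pick the action maximizing reward + value
    of the predicted next state (MCTS does this recursively)."""
    def score(action):
        next_state, reward = dynamics(state, action)
        return reward + value(next_state)
    return max(actions, key=score)


print(plan(7, [1, 2, 3]))  # 3: reaches state 10, the best reward + value
```

Real MuZero expands this into a full Monte Carlo Tree Search over sequences of actions, but the scoring primitive is the same.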
FWIU, from what's going on over there:
AlphaGo => AlphaGo {Fan, Lee, Master, Zero} => AlphaGoZero => AlphaZero => MuZero
AlphaGo Zero: https://en.wikipedia.org/wiki/AlphaGo_Zero
AlphaZero: https://en.wikipedia.org/wiki/AlphaZero
MuZero: https://en.wikipedia.org/wiki/MuZero
AlphaFold {1,2}: https://en.wikipedia.org/wiki/AlphaFold
IIRC, there is not an official implementation of e.g. AlphaZero or MuZero that works with e.g. openai/gym (and openai/retro) for comparing reinforcement learning algorithms? https://github.com/openai/gym
What are the benchmarks for Applied RL?
From https://news.ycombinator.com/item?id=28499001 :
> AFAIU, while there are DLTs that cost CPU, RAM, and Data storage between points in spacetime, none yet incentivize energy efficiency by varying costs depending upon whether the instructions execute on a FPGA, ASIC, CPU, GPU, TPU, or QPU? [...]
> To be 200% green - to put a 200% green footer with search-discoverable RDFa on your site - I think you need PPAs and all directly sourced clean energy.
> (Energy efficiency is very relevant to ML/AI/AGI, because while it may be the case that the dumb universal function approximator will eventually find a better solution, "just leave it on all night/month/K12+postdoc" in parallel is a very expensive proposition with no apparent oracle; and then to ethically filter solutions still costs at least one human)
I'm seeing a lot of cool stuff that people start to build on AlphaFold lately:
- ChimeraX: https://www.youtube.com/watch?v=le7NatFo8vI
Libraries.io indexes software dependencies; but no Dependent packages or Dependent repositories are yet listed for the pypi:alphafold package: https://libraries.io/pypi/alphafold
The GitHub network/dependents view currently lists one repo that depends upon deepmind/alphafold: https://github.com/deepmind/alphafold/network/dependents
(Linked citations for science: How to cite a schema:SoftwareApplication in a schema:ScholarlyArticle , How to cite a software dependency in a dependency specification parsed by e.g. Libraries.io and/or GitHub. e.g. FigShare and Zenodo offer DOIs for tags of git repos, that work with BinderHub and repo2docker and hopefully someday repo2jupyterlite. https://westurner.github.io/hnlog/#comment-24513808 )
/?gscholar alphafold: https://scholar.google.com/scholar?q=alphafold
On a Google Scholar search result page, you can click "Cited by [ ]" to check which documents contain textual and/or URL citations gscholar has parsed and identified as indicating a relation to a given ScholarlyArticle.
/?sscholar alphafold: https://www.semanticscholar.org/search?q=alphafold
On a Semantic Scholar search result page, you can click the "“" to check which documents contain textual and/or URL citations Semantic Scholar has parsed and identified as indicating a relation to a given ScholarlyArticle.
/?smeta alphafold: https://www.meta.org/search?q=t---alphafold
On a Meta.org search result page, you can click the article title and scroll down to "Citations" to check which documents contain textual and/or URL citations Meta has parsed and identified as indicating a relation to a given ScholarlyArticle.
Do any of these use structured data like https://schema.org/ScholarlyArticle ? (... https://westurner.github.io/hnlog/#comment-28495597 )
Interpretable Model-Based Hierarchical RL Using Inductive Logic Programming
I don't work in the field, but I sort of passively follow it.
A year ago I made this comment, in another ML thread:
https://news.ycombinator.com/item?id=23315739
"I often wonder about whether neural networks might need to meet at a crossroads with other techniques."
"Inductive Logic/Answer Set Programming or Constraints Programming seems like it could be a good match for this field. Because from my ignorant understanding, you have a more "concrete" representation of a model/problem in the form of symbolic logic or constraints and an entirely abstract "black box" solver with neural networks. I have no real clue, but it seems like they could be synergistic?"
I can't interpret the paper -- is this roughly in this vein?

I've been thinking along the same lines; it seems like logic + ML would complement each other well. Acquiring trustworthy labeled data is "THE" problem in ML, and figuring out which predicates to string together is "THE" problem in logic programming -- seems like a perfect match.
A logic program can produce a practically infinite number of perfectly consistent test cases for the ML model to learn from, and the ML model can predict which problem should be solved. I'd like to see a conversational interface that combines these two systems, ML generates logic statements and observes the results, repeat. That might help to keep it from going off the rails like a long GPT-3 session tends to do.
AutoML is RL? The entire exercise of publishing and peer review is an exercise in cybernetics?
https://en.wikipedia.org/wiki/Probabilistic_logic_network :
> The basic goal of PLN is to provide reasonably accurate probabilistic inference in a way that is compatible with both term logic and predicate logic, and scales up to operate in real time on large dynamic knowledge bases.
> The goal underlying the theoretical development of PLN has been the creation of practical software systems carrying out complex, useful inferences based on uncertain knowledge and drawing uncertain conclusions. PLN has been designed to allow basic probabilistic inference to interact with other kinds of inference such as intensional inference, fuzzy inference, and higher-order inference using quantifiers, variables, and combinators, and be a more convenient approach than Bayesian networks (or other conventional approaches) for the purpose of interfacing basic probabilistic inference with these other sorts of inference. In addition, the inference rules are formulated in such a way as to avoid the paradoxes of Dempster–Shafer theory.
Has anybody already taught / reinforced an OpenCog [PLN, MOSES] AtomSpace hypergraph agent to do Linked Data prep and also convex optimization with AutoML and better than grid search so gradients?
Perhaps teaching users to bias analyses with e.g. Yellowbrick and the sklearn APIs would be a good curriculum traversal?
openai/baselines "Logging and vizualizing learning curves and other training metrics" https://github.com/openai/baselines#logging-and-vizualizing-...
https://en.wikipedia.org/wiki/AlphaZero
There's probably an awesome-automl by now? Again, the sklearn interfaces.
TIL that SymPy supports NumPy, PyTorch, and TensorFlow [Quantum; TFQ?]; and with a Computer Algebra System something for mutating the AST may not be necessary for symbolic expression trees without human-readable comments or symbol names? Lean mathlib: https://github.com/leanprover-community/mathlib , and then reasoning about concurrent / distributed systems (with side channels in actual physical component space) with e.g. TLA+.
There are new UUID formats that are timestamp-sortable; for when blockchain cryptographic hashes aren't enough entropy. "New UUID Formats – IETF Draft" https://news.ycombinator.com/item?id=28088213
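The timestamp-sortable idea in those drafts is roughly: put a millisecond Unix timestamp in the high bits and random bits after it, so lexicographic order tracks creation order. A stdlib sketch of a UUIDv7-ish identifier (field layout simplified; the real draft also fixes version and variant bits, which are omitted here):

```python
import os
import time


def sortable_id():
    """48-bit ms timestamp + 80 random bits, hex-encoded (32 chars).
    Because the timestamp leads, string comparison follows creation
    time -- unlike fully random UUIDv4s."""
    ms = int(time.time() * 1000) & ((1 << 48) - 1)
    return ms.to_bytes(6, "big").hex() + os.urandom(10).hex()


a = sortable_id()
time.sleep(0.002)  # ensure a later millisecond tick
b = sortable_id()
assert a < b  # later ids sort after earlier ones
```

This is why such ids index well in B-trees: inserts land at the right edge instead of at random leaf pages.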
... You can host online ML algos through SingularityNet, which also does PayPal now for the RL.
Our visual / auditory biological neural networks do appear to be hierarchical and relatively highly plastic as well.
If you're planning to mutate, crossover, and select expression trees, you'll need a survival function (~cost function) in order to reinforce; RL.
Blockchains cost immutable data storage with data integrity protections by the byte.
Smart contracts cost CPU usage with costed opcodes. eWASM (Ethereum WebAssembly) has costed opcodes for redundantly-executed smart contracts (that execute on n nodes of a shard) https://ewasm.readthedocs.io/en/mkdocs/determining_wasm_gas_...
AFAIU, while there are DLTs that cost CPU, RAM, and Data storage between points in spacetime, none yet incentivize energy efficiency by varying costs depending upon whether the instructions execute on a FPGA, ASIC, CPU, GPU, TPU, or QPU?
To be 200% green - to put a 200% green footer with search-discoverable RDFa on your site - I think you need PPAs and all directly sourced clean energy.
(Energy efficiency is very relevant to ML/AI/AGI, because while it may be the case that the dumb universal function approximator will eventually find a better solution, "just leave it on all night/month/K12+postdoc" in parallel is a very expensive proposition with no apparent oracle; and then to ethically filter solutions still costs at least one human)
> Perhaps teaching users to bias analyses with e.g. Yellowbrick and the sklearn APIs would be a good curriculum traversal?
Yellowbrick > Third-Party Estimators (yellowbrick.contrib.wrapper): https://www.scikit-yb.org/en/latest/api/contrib/wrapper.html
From https://www.scikit-yb.org/en/latest/quickstart.html#using-ye... :
> The Yellowbrick API is specifically designed to play nicely with scikit-learn. The primary interface is therefore a Visualizer – an object that learns from data to produce a visualization. Visualizers are scikit-learn Estimator objects and have a similar interface along with methods for drawing. In order to use visualizers, you simply use the same workflow as with a scikit-learn model, import the visualizer, instantiate it, call the visualizer’s fit() method, then in order to render the visualization, call the visualizer’s show() method.
> For example, there are several visualizers that act as transformers, used to perform feature analysis prior to fitting a model. The following example visualizes a high-dimensional data set with parallel coordinates:
from yellowbrick.features import ParallelCoordinates
visualizer = ParallelCoordinates()
visualizer.fit_transform(X, y)
visualizer.show()
> As you can see, the workflow is very similar to using a scikit-learn transformer, and visualizers are intended to be integrated along with scikit-learn utilities. Arguments that change how the visualization is drawn can be passed into the visualizer upon instantiation, similarly to how hyperparameters are included with scikit-learn models.

IIRC, some automl tools - which test various combinations of, stacks of, ensembles of e.g. Estimators - do test hierarchical ensembles? Are those 'piecewise' and ultimately not the unified theory we were looking for here either (but often a good-enough, fast-enough, sufficient approximate solution with a sufficiently low error term)?
/? hierarchical automl "sklearn" site:github.com : https://www.google.com/search?q=hierarchical+automl+%22sklea...
Ship / Show / Ask: A modern branching strategy
> The reason you’re reliant on a lot of “Asking” might be that you have trust issues. “All changes must be approved” or “Every pull request needs 2 reviewers” are common policies, but they show a lack of trust in the development team.
I'm not following. We do reviews because a second pair of eyes can spot things the author missed.
I've been labelled controversial (amongst other things..) in the past for stating this but hey ho, I'll take another leap:
Aside from the "small changes, fast fixes" type of mantra, and "pairing is better than reviewing" that I suspect Martin Fowler is leaning on (and that I both support strongly):
There is a difference between having reviews, and enforcing reviews.
In my multi-decade career I have yet to have a single instance of the thought "thank god we prohibit changes that haven't been reviewed by two other people."
But I have lost count of the number of times I've had the thought "I need to bypass this policy because shit's on fire and I've got a fix."
Then comes the question of quality of review when they are mandated.
I'll interject with: Code reviews are not inherently bad. Feedback is good. Collaboration is good. There are better methods of providing feedback than code reviews (I strongly object to the post-facto nature of code reviews).
Mandated peer reviews less so. Feedback is hurried, if not entirely absent, as it becomes another task that people _have_ to do. The time spent reviewing is rarely accounted for. There's a lot of knowledge transfer required for any non-trivial review to be effective, further increasing the demand on the participants.
Code reviews are another tool in the box, but they should not be used for every job.
Where I currently work, we have "skip review" and "skip preflight" labels for this. The mergers have the power to merge anything anyway, the labels are only to make it an official request.
> Where I currently work, we have "skip review" and "skip preflight" labels for this. The mergers have the power to merge anything anyway, the labels are only to make it an official request.
From the OP:
> Changes are categorized as either Ship (merge into mainline without review), Show (open a pull request for review, but merge into mainline immediately), or Ask (open a pull request for discussion before merging).
Right, it's like saying checklists are a sign of distrust. No, it's an acknowledgement that good intentions don't fix human nature. Have an agreed-upon process and stick to it.
Checklists are often a good thing; and an opportunity to optimize processes with team feedback!
"Post-surgical deaths in Scotland drop by a third, attributed to a checklist" https://news.ycombinator.com/item?id=19684376 https://westurner.github.io/hnlog/#comment-19684376
Show HN: TweeView – A Tree Visualisation of Twitter Conversations
It would be more useful if one could either follow a link to the replies, or read the full tweet text of the corresponding node by clicking on it.
Also the node div should have a background color, for when an image doesn't load.
As an aside, even though @paulgb says he's not bothered with the use of his library that way, it's so similar that I think common OSS etiquette would've been to acknowledge the original project more prominently.
Cheers. It is all open source on https://github.com/edent/TweeView - where I also credit Paul.
The main 2D view is based on his work - but the interactive view (and 3D views) are based on other projects.
Wireless Charging Power Side-Channel Attacks
The severity and number of side-channel attacks is slowly becoming cripplingly ridiculous. At what point do we just give up on defending against side-channel attacks on consumer devices and adopt the mentality that all consumer devices connected to the internet should be treated as insecure by default?
> assume the mentality that all consumer devices connected to the internet should be treated as insecure by default.
"Zero trust security model" https://en.wikipedia.org/wiki/Zero_trust_security_model :
> The main concept behind zero trust is that devices should not be trusted by default, even if they are connected to a managed corporate network such as the corporate LAN and even if they were previously verified.
I checked the power draw of reddit's new design a couple of years ago; it was a huge increase over the old design. I should have written a paper with lots of clever words in it -- could have been my first scientific contribution!
I didn't, because this research was done long ago with RSA key operations, which are both more severe if you can capture them and much more difficult to measure. That you can see the difference between multi-megabyte JavaScript pages is a given at that point.
From https://planetfriendlyweb.org/mental-model :
> When you think about how a digital product or website creates an environmental impact, you can think of it creating it in three main ways - through the Packets of data it sends to users, the Platform the product runs on, and the Process used to make the product itself.
From https://sustainableux.com/talks/2018/how-to-build-a-planet-f... :
> SustainableUX: design vs. climate change. Online, Worldwide, Free. The online event for UX, front-end, and product people who want to make a positive impact—on climate-change, social equality, and inclusion
How We Proved the Eth2 Deposit Contract Is Free of Runtime Errors
You know, I'm not really into cryptocurrency, but it does seem like it's contributing to a resurgence of interest in formal methods. So - thank you, cryptocurrency community!
From "Discover and Prevent Linux Kernel Zero-Day Exploit Using Formal Verification" https://news.ycombinator.com/item?id=27442273 :
> [Coq, VST, CompCert]
> Formal methods: https://en.wikipedia.org/wiki/Formal_methods
> Formal specification: https://en.wikipedia.org/wiki/Formal_specification
> Implementation of formal specification: https://en.wikipedia.org/wiki/Anti-pattern#Software_engineer...
> Formal verification: https://en.wikipedia.org/wiki/Formal_verification
> From "Why Don't People Use Formal Methods?" https://news.ycombinator.com/item?id=18965964 :
>> Which universities teach formal methods?
>> - q=formal+verification https://www.class-central.com/search?q=formal+verification
>> - q=formal+methods https://www.class-central.com/search?q=formal+methods
>> Is formal verification a required course or curriculum competency for any Computer Science or Software Engineering / Computer Engineering degree programs?
To clarify, they implemented the algorithm in Dafny, and then proved that version correct. They did not verify code that will actually run in production.
From the paper:
> Dafny is a practical option for the verification of mission-critical smart contracts, and a possible avenue for adoption could be to extend the Dafny code generator engine to support Solidity … or to automatically translate Solidity into Dafny. We are currently evaluating these options
https://github.com/dafny-lang/dafny
Dafny Cheat Sheet: https://docs.google.com/document/d/1kz5_yqzhrEyXII96eCF1YoHZ...
Looks like there's a Haskell-to-Dafny converter.
Physics-Based Deep Learning Book
"Physics-based" Deep Learning seems like a misnomer. From the abstract "Deep Learning Applications for Physics" sounds more apt. There definitely is value in transferring standard terminology and methods from physics to deep learning. But from the preview it's unclear if that is the focus.
Indeed, “Deep Learning Based Physics” seems a little more correct.
The common terminology is Physics Informed Neural Networks (PINNs)
"Physics-informed neural networks" https://en.wikipedia.org/wiki/Physics-informed_neural_networ...
But what about statistical thermodynamics and information theory? What about thin film?
What are some applications for PINNs and for {DL, RL,} in physics?
Ask HN: Books that teach you programming languages via systems projects?
Looking for a book/textbook that teaches you a programming language through systems (or vice versa). For example, a book that teaches modern C++ by showing you how to program a compiler; a book that teaches operating systems and the language of choice in the book is Rust; a book that teaches database internals through Golang; etc. Basically, looking for a fun project-based book that I can walk through and spend my free time working through.
Any recommendations?
From "Ask HN: What are some books where the reader learns by building projects?" https://news.ycombinator.com/item?id=26042447 :
> "Agile Web Development with Rails [6]" (2020) teaches TDD and agile in conjunction with a DRY, CoC, RAD web application framework: https://g.co/kgs/GNqnWV
And:
> "ugit – Learn Git Internals by Building Git in Python" https://www.leshenko.net/p/ugit/
How you can track your personal finances using Python
> We take the output of the previous step, pipe everything over to our .beancount file, and "balance" transactions.
> Recall that the flow of money in double-entry accounting is represented using transactions involving at least two accounts. When you download CSVs from your bank, each line in that CSV represents money that's either incoming or outgoing. That's only one leg of a transaction (credit or debit). It's up to us to provide the other leg.
> This act is called "balancing".
Balance (accounting) https://en.wikipedia.org/wiki/Balance_(accounting)
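The balancing step described above is mechanical: each one-leg CSV row from the bank gets a counter-leg in a category account, so the postings of every transaction sum to zero. A stdlib sketch (account and column names are hypothetical; real Beancount importers are more involved):

```python
import csv
import io

# One-leg bank CSV: negative amount = money leaving the account.
bank_csv = io.StringIO(
    "date,payee,amount\n"
    "2021-01-01,Landlord,-1000.00\n"
)


def balance(row, counter_account):
    """Add the second leg so the double-entry transaction sums to zero."""
    amount = float(row["amount"])
    return [
        ("Assets:MyBank:Checking", amount),   # leg from the CSV
        (counter_account, -amount),           # leg we must supply
    ]


for row in csv.DictReader(bank_csv):
    postings = balance(row, "Expenses:Rent")
    assert sum(amount for _, amount in postings) == 0.0  # the invariant
    print(postings)
```

(Production tooling would use `decimal.Decimal` rather than floats for money; floats are used here only to keep the sketch short.)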
Are unique record IDs necessary for this [financial] application? FWICS, https://plaintextaccounting.org/ just throws away the (probably per-institution) transaction IDs; like a non-reflexive logic that eschews Law of identity? Just grep and wc?
> What does the ledger look like?
> I wrote earlier that one of the main things that Beancount provides is a language specification for defining financial transactions in a plain-text format.
> What does this format look like? Here's a quick example:
option "title" "Alice"
option "operating_currency" "EUR"
; Accounts
2021-01-01 open Assets:MyBank:Checking
2021-01-01 open Expenses:Rent
2021-01-01 * "Landlord" "Thanks for the rent"
Assets:MyBank:Checking -1000.00 EUR
Expenses:Rent 1000.00 EUR
What does the `*` do?

The star is just a 1-digit "flag" field. I don't think there are any defined semantics for the field, but by convention 'star' means 'no special flags on this transaction'.
People use other flags to indicate, for instance, whether a transaction has been reconciled, or cleared their bank, or whatnot.
a REST API such as: https://plaid.com/
Plaid has serious privacy and security concerns, so I would be careful:
From https://news.ycombinator.com/item?id=28203393 :
> No, your personal data is not sold or rented or given away or bartered to parties that are not Plaid, your bank, or the connected app. We talk about all of this in our privacy policy, including ways that data could be used — for example, with data processors/service providers (like AWS which hosts our services) for the purposes of running Plaid’s services or for a user’s connected app to provide their services.
>> I saw that. Thank you for your patience and persistence in responding to so many pointed questions.
>> For any interested, here is a link to relevant section of the referenced privacy policy: https://plaid.com/legal/#consumers
>> I am also impressed by the Legal Changelog on the same page that clearly lays out a log of changes made to privacy & other published legal documents.
The comments in those threads were more negative than positive, and the fact that Plaid paid a $58 million settlement for allegedly sharing personal banking data with third parties without consent is telling enough. I am not going to give my banking usernames and passwords to Plaid in plaintext, when their employee is arguing on HN over what the word "sold" means:
Are you making claims without evidence? Settling is not admission of guilt.
Banks should implement read-only OAuth APIs, so that users are not required to store their u/p/sqa answers.
From "Canada calls screen scraping ‘unsecure,’ sets Open Banking target for 2023" https://news.ycombinator.com/item?id=28229957 :
> AFAIU, there are still zero (0) consumer banking APIs with Read-Only e.g. OAuth APIs in the US as well?
Looks like there may be fewer than three so far.
> Banks could save themselves CPU, RAM, bandwidth, and liability by implementing read-only API tokens and methods that need only return JSON - instead of HTML or worse, monthly PDF tables for a fee - possibly similar to the Plaid API: https://plaid.com/docs/api/
> There is competition in consumer/retail banking, but still the only way to do e.g. budget and fraud analysis with third party apps is to give away all authentication factors: u/p/sqa; and TBH that's unacceptable.
> Traditional and distributed ledger service providers might also consider W3C ILP: Interledger Protocol (in starting their move to quantum-resistant ledgers by 2022 in order to have a 5 year refresh cycle before QC is a real risk by 2027, optimistically, for science) when reviewing the entropy of username+password_hash+security_question_answer strings in comparison to the entropy of cryptoasset account public key hash strings: https://interledger.org/developer-tools/get-started/overview...
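What a read-only token flow could look like, as a hedged sketch: the base URL, endpoint path, and JSON shape below are assumptions, not any real bank's API. The point is that a revocable bearer token replaces stored u/p/sqa entirely.

```python
import json
import urllib.request

# Hypothetical read-only bank API; no real bank exposes this endpoint.
API_BASE = "https://bank.example/api/v1"

def transactions_request(token: str, account_id: str) -> urllib.request.Request:
    """Build a token-authenticated, read-only request; no credentials stored."""
    return urllib.request.Request(
        f"{API_BASE}/accounts/{account_id}/transactions",
        headers={"Authorization": f"Bearer {token}",
                 "Accept": "application/json"},
    )

def parse_transactions(payload: str):
    """Parse the assumed JSON response into (date, amount) pairs."""
    return [(t["date"], t["amount"]) for t in json.loads(payload)["transactions"]]
```

Revoking the token kills third-party access without a password change, which is the whole point of the read-only API argument above.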
> Are you making claims without evidence?
No. Plaid did agree to pay the $58 million, and the lawsuit was for alleged data sharing with third parties without user consent. I don't care if they admit guilt or not. They agreed to pay $58 million to end the lawsuit, and that does not engender trust. Shifting the blame to banks doesn't make Plaid any more reputable.
Providing usernames and passwords of sensitive accounts to a third party is a privacy and security risk, and Plaid has not earned enough trust from me to justify the risk I would need to assume to use their services.
How did their policies change before and after said settlement?
From https://my.plaid.com/help/360043065354-does-plaid-have-acces... :
> Does Plaid have access to my credentials?
> The type of connection Plaid has to your financial institution determines whether or not we have access to the login credentials for your financial account: your username and password.
> In many cases, when you link a financial institution to an app via Plaid, you provide your login credentials to us and we securely store them. We use those credentials to access and obtain information from your financial institution in order to provide that information, at your direction, to the apps and services you want to use. For more information on how we use your data, please refer to our End User Privacy Policy.
> In other cases, after you request that we link your financial institution to an app or service you want to use, you will be prompted to provide your login credentials directly to your financial institution––not to Plaid––and, upon successful authentication, your financial institution will then return your data to Plaid. In these cases, Plaid does not access or store your account credentials. Instead, your financial institution provides Plaid with a type of security identifier, which permits Plaid to securely reconnect to your financial institution at regularly scheduled intervals to keep your apps and services up-to-date.
> Regardless of which type of connection is made, we do not share your credentials with the apps or services you’ve linked to your financial institution via Plaid. You can read more about how Plaid handles data here.
What do you think this should say instead?
Do you think they use the same key to securely store all accounts, like ACH? Or no key, like the bank ledger that you're downloading a window of as CSV, hopefully through a read-only SQL account, and hopefully with data encrypted at rest and in motion?
When you download a CSV or an OFX to a local file, is the data then still encrypted at rest?
Again, US Banks can eliminate the need for {Plaid, Mint, } as the account data access middlemen by providing a read-only OAuth API. Because banks do not have a way to allow users to grant read-only access to their account ledgers, the only solution is to securely store the u/p/sqa. If you write a script to fetch your data and call it from cron, how can you decrypt the account credentials after an unattended reboot? When must a human enter key material to decrypt the stored u/p/sqa?
Here we realize that banks should really have infosec people who comprehend symmetric and asymmetric cryptography perform audits to point out these sorts of vulnerabilities and risks. And if they had kept current with the times, we would have a very different banking and finance information system architecture with fewer single points of failure.
I'm not interested in what Plaid puts in a help page, since Plaid's $58 million settlement is for alleged data sharing with third parties without consent, meaning that Plaid is accused of not properly communicating the alleged data sharing to its users or obtaining permission.
And Plaid's terms of service (https://plaid.com/legal/#how-we-use-your-information) contains vague catch-alls such as:
> We share your End User Information for a number of business purposes:
> With our data processors and other service providers, partners, or contractors in connection with the services they perform for us or developers
Sure, it would be great if banks offered different authentication systems, but that has nothing to do with my lack of trust for Plaid. A different authentication system wouldn't eliminate the data sharing concerns I have with Plaid.
CISA Lays Out Security Rules for Zero Trust Clouds
"Cloud Security Technical Reference Architecture (TRA)" (2021) https://cisa.gov/publication/cloud-security-technical-refere...
> The Cloud Security TRA provides agencies with guidance on the shared risk model for cloud service adoption (authored by FedRAMP), how to build a cloud environment (authored by USDS), and how to monitor such an environment through robust cloud security posture management (authored by CISA).
> Public Comment Period - NOW OPEN! CISA is releasing the Cloud Security TRA for public comment to collect critical feedback from agencies, industry, and academia to ensure the guidance fully addresses considerations for secure cloud migration. The public comment period begins Tuesday, September 7, 2021 and concludes on Friday, October 1, 2021. CISA is interested in gathering feedback focused on the following key questions: […]
"Zero Trust Maturity Model" (2021) https://cisa.gov/publication/zero-trust-maturity-model
> CISA’s Zero Trust Maturity Model is one of many roadmaps for agencies to reference as they transition towards a zero trust architecture. The goal of the maturity model is to assist agencies in the development of their zero trust strategies and implementation plans and present ways in which various CISA services can support zero trust solutions across agencies.
> The maturity model, which include five pillars and three cross-cutting capabilities, is based on the foundations of zero trust. Within each pillar, the maturity model provides agencies with specific examples of a traditional, advanced, and optimal zero trust architecture.
> Public Comment Period – NOW OPEN! CISA drafted the Zero Trust Maturity Model in June to assist agencies in complying with the Executive Order. While the distribution was originally limited to agencies, CISA is excited to release the maturity model for public comment.
> CISA is releasing the Zero Trust Maturity Model for public comment beginning Tuesday, September 7, 2021 and concludes on Friday, October 1, 2021. CISA is interested in gathering feedback focused on the following key questions: […]
"Zero trust security model": https://en.wikipedia.org/wiki/Zero_trust_security_model
Show HN: Heroku Alternative for Python/Django apps
I have been following this space a lot. Heroku-like deployment platforms have exploded recently, partly because people miss that experience, partly because buildpacks are open source [1], and partly because we have gone through a tech shift that makes it pretty easy to emulate. There are two broad categories: 1. the Dokkus of the world, one VM (very recently trying to add multi-node); and 2. more recently, getporter.dev [2] and okteto [3], which try to replicate Heroku on Kubernetes.
One thing I noticed is that it looks really cool at first glance, but every organisation has its own complexity in deploying stuff. It is integration hell: you have to provide white-glove treatment to every customer, and eventually you have a hard time scaling. We tried it too, and what we noticed is that if you have to go to an enterprise and sell, you start building leaky abstractions. It works really well for early-stage startups, but once they have the money, they prefer a more customizable and robust solution.
This project looks cool. I do not want to discourage anyone but again, it is a red ocean out there.
The problem is the layering of the abstractions. Having a PaaS on top of Kubernetes, such that you can move down to a more powerful primitive when needed, is going to be the most scalable.
What happened in the public cloud space was a disparate set of services which made the choices really costly. Azure went in heavy on PaaS while AWS was focused on VMs. The easy adoption and migration was on AWS. K8s seems like a reasonable abstraction that reduces the cost of these decisions because people seem to agree that K8s is a reasonable base.
Being able to slide on a compatible PaaS layer shouldn't negate that K8s investment, and it should make K8s easier to adopt for some organizations, or parts of those organizations. But I wouldn't argue that we are at the point where that layer is mature.
dokku-scheduler-kubernetes https://github.com/dokku/dokku-scheduler-kubernetes#function...
> The following functionality has been implemented: Deployment and Service annotations, Domain proxy support via the Nginx Ingress Controller, Environment variables, Letsencrypt SSL Certificate integration via CertManager, Pod Disruption Budgets, Resource limits and reservations (reservations == kubernetes requests), Zero-downtime deploys via Deployment healthchecks, Traffic to non-web containers (via a configurable list)
SPDX Becomes Internationally Recognized Standard for Software Bill of Materials
From OP:
> Between eighty and ninety percent (80%-90%) of a modern application is assembled from open source software components. An SBOM accounts for the software components contained in an application — open source, proprietary, or third-party — and details their provenance, license, and security attributes. SBOMs are used as a part of a foundational practice to track and trace components across software supply chains. SBOMs also help to proactively identify software issues and risks and establish a starting point for their remediation.
> SPDX results from ten years of collaboration from representatives across industries, including the leading Software Composition Analysis (SCA) vendors – making it the most robust, mature, and adopted SBOM standard.
https://en.wikipedia.org/wiki/Software_Package_Data_Exchange
Show HN: Arxiv.org on IPFS
Ah cool… I also took a stab at something similar several years ago: https://github.com/ecausarano/heron
Also at the time I was considering IPFS.
But I guess the real trick is implementing a WoT (web of trust) to support peer review and filter out the inevitable junk that will be published
"Help compare Comment and Annotation services: moderation, spam, notifications, configurability" executablebooks/meta#102 https://github.com/executablebooks/meta/discussions/102 :
> jupyter-comment supports a number of commenting services [...]. In helping users decide which commenting and annotation services to include on their pages and commit to maintaining, could we discuss criteria for assessment and current features of services?
> Possible features for comparison:
> * Content author can delete / hide
> * Content author can report / block
> * Comments / annotations are screened by spam-fighting service
> * Content / author can label as e.g. toxic
> * Content author receives notification of new comments
> * Content author can require approval before user-contributed content is publicly-visible
> * Content author may allow comments for a limited amount of time (probably more relevant to BlogPostings)
> * Content author may simultaneously denounce censorship in all its forms while allowing previously-published works to languish
#ForScience
FWIW, archiving repo2docker-compatible git repos with a DOI attached to a git tag, is possible with JupyterLite:
> JupyterLite is a JupyterLab distribution that runs entirely in the browser built from the ground-up using JupyterLab components and extensions
With JupyterLite, you can build a static archive of a repo2docker-like environment so that the ScholarlyArticle notebook (or Computer Modern LaTeX CSS), its SoftwareRelease dependencies, and possibly also the Datasets can be run in a browser tab with WASM: HTML + JS + WASM.
New Texas Abortion Law Likely to Unleash a Torrent of Lawsuits Against Education
All this talk about Austin becoming the new SV - but then you see regressive laws like this getting pushed through. No thanks.
Migration of people from SV to Austin will change the political landscape and stop stuff like this from being passed.
You don't have to pay prisoners for work in Texas.
Unfortunately, I find it necessary to help elucidate some moral issues that recent legal developments in one state of the Union have brought to light.
Such a God is correctly assessed as hostile to the patrons of Earth.
Such a creation myth involving seven (7) days to create Heaven, Hell Lite, and Hell (Heaven, Earth, and Hell) is what we must interpret.
At issue is whether the singular, supernatural God in Heaven is Benevolent (Good) and Omnipotent (All-powerful). Why would a loving God create Heaven without suffering on the first day, and then create Earth etc. with suffering on the succeeding days? Why not create Heaven for all? Why send infants into a life of suffering on Earth (Hell Lite) when they could be sent directly to Heaven with no suffering?
Ergo, God's plan is to force babies to suffer on Earth for a while; but if they die sinless, then they go to Heaven where parents aren't needed. Ergo, Heaven is full of parentless dead babies.
You've made asses of yourselves in trying to manipulate others into your golden candlestick tax districts, where the whole congregation does not have healthcare.
God created a world in need of healthcare, and isn't getting healthcare to it.
Ergo, because God created a world of suffering for us - and babies - god is not Friendly. And if god is not friendly, god is a very poor basis for our morality, and for our legal system.
Case in point: did god ask for Mary's (or even Joseph's) consent?
A benevolent god would design in required consent.
In the Christian bible, furthermore, god tells them to kill even the enemies' women and unborn children. Ergo, god does not value the sanctity of infants' lives over other concerns.
For citations of scripture, see e.g. Freedom from Religion Foundation > What Does the Bible Really Say About Abortion? https://ffrf.org/component/k2/item/25602-abortion-rights
A very poor basis for morality and law, indeed.
What is "Original Sin" if not a suggestion that we should choose very wisely?
And when they rolled that [silver] rock from the tomb, behold what did they find? No dead baby; for the dead baby of god was in heaven, gentlemen. Sinless dead babies all go to heaven, gentlemen.
I think we can all agree that preventing unintended pregnancy is a worthwhile objective; though we have very differing views on whether abstinence-based education works. States with higher rates of religiosity (by church attendance) have higher rates of teen pregnancy: there is a somewhat strong positive correlation.
A standing prayer request: Pray that we might get healthcare to our congregation.
IDK, what do we say here? We're going to start needing to be making some changes?
Roman society context on this one:
Vestal virgins: https://en.wikipedia.org/wiki/Vestal_Virgin
Baiae: https://en.wikipedia.org/wiki/Baiae
https://pbsinternational.org/programs/underwater-pompeii/ :
> Baiae: an ancient Roman city lost to the same volcanoes that entombed Pompeii. But unlike Pompeii, Baiae sits under water, in the Bay of Naples. Nearly 2,000 years ago, the city was an escape for Rome’s rich and powerful elite, a place where they were free of the social restrictions of Roman society. But then the city sank into the ocean, to be forgotten in the annals of history. Now, a team of archaeologists is mapping the underwater ruins and piecing together what life was like in this playground for the rich. What made Baiae such a special place? And what happened to it?
Woe! Woe unto the obviously promiscuous.
DARPA grant to work on sensing and stimulating the brain noninvasively [video]
Won't this mean that you cannot think of anything else while operating the equipment?
Do you need to not think of anything else to move your arms?
Ex-BCI research student here.
The nerves that control your arm are wired directly into your central nervous system and can activate your muscles in direct response to action potentials fanned out from a single neuron.
These noninvasive technologies pick up aggregate signals from hundreds of thousands of neurons all firing pseudo-randomly, filtered spatially by the dura (thick layer of membrane that protects the brain), skull, flesh, skin and hair.
A more "HN" analogy would be comparing a directly wired Ethernet connection (nerves connected to the CNS) to a side channel attack on an air-gapped system (noninvasive EEG). While randomly running background processes (normally) won't affect the data transmitted to the network, they will absolutely foil most side channel attacks.
What about with realtime NIRS with an (inverse?) scattering matrix? From https://www.openwater.cc/technology :
> Below are examples of the image quality we have achieved with our breakthrough scanning systems that use just red and near-infrared light and ultrasound pings.
https://en.wikipedia.org/wiki/Near-infrared_spectroscopy
Another question: is it possible to do molecular identification, similar to quantum crystallography, with photons of any wavelength, such as NIRS? Could that count things in samples?
https://twitter.com/westurner/status/1239012387367387138 :
> ... quantum crystallography: https://en.wikipedia.org/wiki/Quantum_crystallography There's probably some limit to infrared crystallography that anyone who knows anything about particles and lattices would know about ?
> What about with realtime NIRS with an (inverse?) scattering matrix?
I've purged a lot of the knowledge from that time from my brain, but from what I recall fNIRS takes a long time (on the order of multiple seconds) to take a single reading.
It also only really shows which regions of the brain are receiving more blood supply. While a huge improvement in spatial precision over EEG, it's still not anywhere near the same level of precision that a directly-wired nerve has.
That doesn't mean it's useless technology, just probably not viable for the low-latency, high-accuracy control you'd want for a military drone.
> Another question: is it possible to do molecular identification, similar to quantum crystallography, with photons of any wavelength, such as NIRS? Could that count things in samples?
I have absolutely no idea about molecular identification or quantum crystallography, so probably can't give you a good answer on that one.
The newer systems can sample much faster (10-100 Hz), but they're still limited by the fact that they're measuring a hemodynamic signal that lags behind neural activity.
Which other strong and weak forces could [photonic,] sensors detect?
IIUC, they're shooting for realtime MRI resolution with NIRS, to be used to assist surgeons in realtime during surgery.
edit: https://en.wikipedia.org/wiki/Neural_oscillation#Overview says brainwaves are 1-150 Hz? IIRC compassion is achievable on a bass guitar.
A couple of studies linked in Wikipedia for this use (one retracted):
https://www.ncbi.nlm.nih.gov/pmc/articles/PMC4049706/
Retracted: https://journals.plos.org/plosbiology/article?id=10.1371/jou...
Table with resolution differences between different techniques:
https://www.ncbi.nlm.nih.gov/pmc/articles/PMC8116487/table/T...
Most NIRS systems use continuous-wave (CW) systems detecting blood oxygenation/chromophore concentration. I say this because, in the time domain, you're looking at 5-7 seconds to see changes in cerebral oxygenation.
It's also noteworthy that you're dealing with huge amounts of noise, very low resolution, and are limited to the cortical surface. This limits the applications in the BCI-domain.
I'm currently researching the feasibility of fast optical imaging. This has to be done in the frequency domain, but may yield temporal resolution in the milliseconds. The downside is finding an incoherent light source that's able to be modulated fast enough to detect the scattering changes.
You mentioned "time-domain", and I recalled "time-polarization".
From https://twitter.com/westurner/status/1049860034899927040 :
https://web.archive.org/web/20171003175149/https://www.omnis...
"Mind Control and EM Wave Polarization Transductions" (1999)
> To engineer the mind and its operations directly, one must perform electrodynamic engineering in the time * domain, not in the 3-space EM energy density domain.*
Could be something there.
Topological Axion antiferromagnet https://phys.org/news/2021-07-layer-hall-effect-2d-topologic... :
> Researchers believe that when it is fully understood, TAI can be used to make semiconductors with potential applications in electronic devices, Ma said. The highly unusual properties of Axions will support a new electromagnetic response called the topological magneto-electric effect, paving the way for realizing ultra-sensitive, ultrafast, and dissipationless sensors, detectors and memory devices.
Optical topological antennas https://engineering.berkeley.edu/news/2021/02/light-unbound-... :
> The new work, reported in a paper published Feb. 25 in the journal Nature Physics, throws wide open the amount of information that can be multiplexed, or simultaneously transmitted, by a coherent light source. A common example of multiplexing is the transmission of multiple telephone calls over a single wire, but there had been fundamental limits to the number of coherent twisted light waves that could be directly multiplexed.
Rydberg sensor https://phys.org/news/2021-02-quantum-entire-radio-frequency... :
> Army researchers built the quantum sensor, which can sample the radio-frequency spectrum—from zero frequency up to 20 GHz—and detect AM and FM radio, Bluetooth, Wi-Fi and other communication signals.
> The Rydberg sensor uses laser beams to create highly-excited Rydberg atoms directly above a microwave circuit, to boost and hone in on the portion of the spectrum being measured. The Rydberg atoms are sensitive to the circuit's voltage, enabling the device to be used as a sensitive probe for the wide range of signals in the RF spectrum.
> "All previous demonstrations of Rydberg atomic sensors have only been able to sense small and specific regions of the RF spectrum, but our sensor now operates continuously over a wide frequency range for the first time,"
Sometimes people make posters or presentations for new tech, in medicine.
The xMed Exponential Medicine conference / program is in November this year: https://twitter.com/ExponentialMed
Space medicine also presents unique constraints that more rigorously select from possible solutions: https://en.wikipedia.org/wiki/Space_medicine
There is no progress in medicine without volunteers for clinical research trials. https://en.wikipedia.org/wiki/Phases_of_clinical_research
New Ways to Be Told That Your Python Code Is Bad
> Python programmers in general have an irrational aversion to if-expressions, a.k.a. the “ternary” operator.
Because the Python ternary operator is fucking backwards! Why on earth would you put the condition in the middle!?
x = 4 if condition() else 5
vs.: condition() ? 4 : 5
vs. the author's own lisp example: (setq x (if (condition) 4 5))
I love ternary operators, and Lisp/ML-style if/else blocks that return things, but Python puts the 4 and the 5 so far away from each other that even I hate using it, especially when the condition is a chained mess rather than just a function call.
As I recall, `object?` and `object??` work in IPython because the Python mailing list confirmed that `?` was not reserved for the ternary operator. (IIRC there was no formal grammar, or collections.abc, or maybe even datetime or json, at the time.)
Ternary expressions on one line require branch coverage to be enabled in your e.g. pytest; otherwise it'll look like the whole line is covered by tests when each branch on said line hasn't actually been tested.
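A minimal illustration of that coverage pitfall, with a made-up function name (assuming a coverage.py-style tool, e.g. pytest with `--cov --cov-branch`):

```python
def shipping_cost(total):
    # A one-line conditional expression: with statement coverage only, a
    # single test hitting either arm marks this whole line "covered".
    # Branch coverage must be enabled to reveal that each arm needs its
    # own test case.
    return 0 if total >= 100 else 5
```

With only `shipping_cost(150)` in the test suite, statement coverage reports 100% even though the `else 5` arm was never exercised.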
.get() -> Union[None, T]
Web-based editor
It looks like this is actually just the editor from Visual Studio Online unless I've missed something.
It's great, but if it were really Visual Studio Code that would be awesome and I'd be pleasantly surprised.
(The difference being that if it's just the editor it misses out on all the compiler/analysis integrations. If Github were providing Linux containers in the cloud for working on projects - essentially the SSH feature from Visual Studio Code - it would be absolutely brilliant.)
I think what you might want is GitHub Codespaces?: https://github.com/features/codespaces. Was on HN recently too.
Hmm. $0.18 per hour for 4GB of RAM is rather steep. A c6g.large on AWS would run you less than half of that, for the same number of CPUs and memory. A quarter if you're happy to use spot instances. Might be neat to have an OSS tool that spins up an EC2 (or Azure or GCP) instance running Code-Server[1] (VS Code running natively on the server, but presenting the UI in the browser), with a given git repo and credentials.
Admittedly, the storage costs would be higher.
The ml-workspace docker image includes Git, Jupyter, VS Code, SSH, and "many popular data science libraries & tools" https://github.com/ml-tooling/ml-workspace
docker run -p 8080:8080 -v "${PWD}:/workspace" mltooling/ml-workspace
Cocalc-docker also includes Git, Jupyter, SSH, a collaborative LaTeX editor, a time slider, but no code-server or VScode out of the box:
https://github.com/sagemathinc/cocalc-docker
docker run --name=cocalc -d -v ~/cocalc:/projects -p 443:443 sagemathinc/cocalc
GitHub Copilot Generated Insecure Code in 40% of Circumstances During Experiment
For comparison, what percentage of human-generated code is secure?
> For comparison, what percentage of human-generated code is secure?
Yeah how did they measure? Did static and dynamic analysis find design bugs too?
Maybe - as part of a Copilot-assisted DevSecOps workflow involving static and dynamic analysis run by GitHub Actions CI - create Issues with CWE "Common Weakness Enumeration" URLs from e.g. the CWE Top 25 in order to train the team, and Pull Requests to fix each issue?: https://cwe.mitre.org/top25/
Which bots send PRs?
AAS Journals Will Switch to Open Access
Publication charges for authors and students will probably rise by a factor of 2 or more... open access journals in technical fields are approaching $500 per page.
This will basically lock out people in the developing world from publishing in these journals.
> JOSS (Journal of Open Source Software) has managed to get articles indexed by Google Scholar [rescience_gscholar]. They publish their costs [joss_costs]: $275 Crossref membership, DOIs: $1/paper:
>> Assuming a publication rate of 200 papers per year this works out at ~$4.75 per paper
> [joss_costs]: https://joss.theoj.org/about#costs
^^ from https://news.ycombinator.com/item?id=24517711 & this log of my non- markdown non- W3C Web Annotation threaded comments with URIs: https://westurner.github.io/hnlog/#comment-24517711
Does anyone here have thoughts on JOSS? I've reviewed for them once, had the impression the editors take their job seriously, and think I'll review again in the future. The review approach that the journal facilitates has a strong focus on engineering aspects, i.e. it addresses a weakness of other venues, where it often does not matter how messy, unstable, and poorly documented the code is (or even if it compiles). On the other hand, the JOSS reviewers are typically not experts on the problem that the software is solving.
I'm currently reviewing for JOSS, and have done so before. In many ways they're a very strange journal: the paper is nearly an afterthought, and the review is focused on the code. But I like them. As you say, the editors take their role seriously. And it seems to have two valuable contributions.
Firstly, encouraging and structuring code review in academia. My own code is almost entirely solo (and messy), so a venue for structured review and an incentive to robustify public code is good. Secondly, the culture in some disciplines is that code is not citable, only papers - and JOSS is an end-run around this. I hope this second situation is changing, but we're not there yet so JOSS has a valuable role for the moment in simply being a 'journal' assigning DOIs basically for code packages.
[Scholarly] Code review tools; criteria and implementations?
Does JOSS specify e.g. ReviewBoard, GitHub Pull Request reviews, or Gerrit for code reviews?
The reviews for JOSS happen on github[0] but the journal's not prescriptive about how you develop your package as long as the code is public. The criteria for the JOSS review are very clear[1].
I don't want to oversell the depth of the code review possible; not all of the reviewers will be fully expert in whatever tiny cutting-edge area the package is for (making correctness checks difficult beyond the test suite), and most of us are academics-who-code rather than research software engineers. But the fact it's happening at all is a great step forward.
[0]: https://github.com/openjournals/joss-reviews/issues [1]: https://joss.readthedocs.io/en/latest/review_criteria.html
Thanks for the citations. Looks like Wikipedia has "software review" and "software peer review":
https://en.wikipedia.org/wiki/Software_review
https://en.wikipedia.org/wiki/Software_peer_review
I'd add "Antipatterns" > "Software" https://en.wikipedia.org/wiki/Anti-pattern#Software_design
and "Code smells" > "Common code smells" https://en.wikipedia.org/wiki/Code_smell#Common_code_smells
and "Design smells" for advanced reviewers: https://en.wikipedia.org/wiki/Design_smell
and the CWE "Common Weakness Enumeration" numbers and thus URLs for Issues from the CWE Top 25 and beyond: https://cwe.mitre.org/top25/
FWIW, many or most scientists are not even trying to be software engineers: they just write slow code without reusing already-tested components and expect someone else to review Pull Requests after their PDF is considered impactful. They know enough coding to raise the bar for their domain a bit higher each time.
Are there points for at least planning, in writing, for the complete lifecycle and governance of open source software for science (an ongoing thesis defense, of sorts)? After we publish, what becomes of this code?
From https://joss.theoj.org/about#costs :
> Income: JOSS has an experimental collaboration with AAS publishing where authors submitting to one of the AAS journals can also publish a companion software paper in JOSS, thereby receiving a review of their software. For this service, JOSS receives a small donation from AAS publishing. In 2019, JOSS received $200 as a result of this collaboration.
A low cost of $1/page just shows how much of a scam even prestigious journals have become for all involved parties.
Referees work for free. Submitting authors pay through the nose for presumably typesetting and proofs whose true cost is closer to $1/page rather than hundreds. Editors are overworked.
Sci-Hub was supposed to disrupt that industry, but it seems all it's done is shift the burden to authors and create pay-to-publish models.
Moderation costs money, too.
Additional ScholarlyArticle "Journal" costs: moderation, BinderHub / JupyterLite white label SaaS?, hosting data and archived reproducible container images on IPFS and academictorrents and Git LFS, hosting {SQL, SPARQL, GraphQL,} queries and/or a SOLID HTTPS REST API and/or RSS feeds with dynamic content but static feed item URIs and/or ActivityStreams and/or https://schema.org/Action & InteractAction & https://schema.org/ReviewAction & ClaimReview fact check reviews, W3C Web Notifications, CRM + emailing list, keeping a legit cohort of impactful peer reviewers,
#LinkedData for #LinkedResearch: Dokieli, parsing https://schema.org/ScholarlyArticle citation styles,
> keeping a legit cohort of impactful peer reviewers, [who are time-constrained and unpaid, as well]
"Ask HN: How are online communities established?" https://news.ycombinator.com/item?id=24443965 re: building community, MCOS Marginal Cost of Service, CLV Customer Lifetime Value, etc
White House Launches US Digital Corps
I've worked with state government as a volunteer advisor. They're still developing everything with waterfall. Only contracting out to big firms, even if it's a small project. Lawmakers and aides sit in a room and write down what is to be done.
That's changing at the federal level. They know they've got a problem. Why shouldn't federal software be as easy to use as the best web software? If you've ever tried to use it you will quickly learn that isn't the case.
Some sites will only work with IE and no other browser. Developers in two years can make a huge difference for making the government be more agile and operate better.
I always suggest joining a local Code For America brigade. Work on a local project and see if it is for you. If you find yourself drawn to it then consider applying for a two year stint with the federal government. You can really make a difference!
> I've worked with state government as a volunteer advisor. They're still developing everything with waterfall. Only contracting out to big firms, even if it's a small project. Lawmakers and aides sit in a room and write down what is to be done.
The US Digital Services Playbook likely needs few modifications for use at state and local levels? https://github.com/usds/playbook#readme
"PLAY 1: Understand what people need" https://playbook.cio.gov/#play1
"PLAY 4: Build the service using agile and iterative practices" https://playbook.cio.gov/#play4
Do [lawmakers and aides] make good "Product Owners", stakeholders, [incentivized, gamified] app feedback capability utilizers? GitLab has Service Desk: you can email the service desk address without having an account, as necessary to create and follow up on [software] issues in GitHub/BitBucket/GitLab/Gitea project management systems.
> That's changing at the federal level. They know they've got a problem. Why shouldn't federal software be as easy to use as the best web software? If you've ever tried to use it you will quickly learn that isn't the case.
"PLAY 3: Make it simple and intuitive" https://playbook.cio.gov/#play3
> Some sites will only work with IE and no other browser. Developers in two years can make a huge difference for making the government be more agile and operate better.
US Web Design Standards https://designsystem.digital.gov/
From https://github.com/uswds/uswds#browser-support :
>> We’ve designed the design system to support older and newer browsers through progressive enhancement. The current major version of the design system (2.0) follows the 2% rule: we officially support any browser above 2% usage as observed by analytics.usa.gov. Currently, this means that the design system version 2.0 supports the newest versions of Chrome, Firefox, Safari, and Internet Explorer 11 and up.
> I always suggest joining a local Code For America brigade. Work on a local project and see if it is for you. If you find yourself drawn to it then consider applying for a two year stint with the federal government. You can really make a difference!
From https://en.wikipedia.org/wiki/Code_for_America :
>> [...] described Code for America as "the technology world's equivalent of the Peace Corps or Teach for America". The article goes on to say, "They bring fresh blood to the solution process, deliver agile coding and software development skills, and frequently offer new perspectives on the latest technology—something that is often sorely lacking from municipal government IT programs. This is a win-win for cities that need help and for technologists that want to give back and contribute to lower government costs and the delivery of improved government service."
Launch HN: Litnerd (YC S21) – Teaching kids to read with the help of live actors
Hi HN, my name is Anisa and I am the founder of Litnerd (https://litnerd.com/), an online reading program designed to teach elementary school students in America how to read.
There are 37M elementary school students in America. Schools spend $20B on reading and supplemental education programs. Yet 42% of 4th grade students are reading at a 1st or 2nd grade proficiency level! The #1 reason students aren’t reading? They say it’s boring. We change that by bringing books to life. Think your favorite book turned into a tv-show style episode-by-episode reenactment, coupled with a complete curriculum and lesson plans.
1 in 8 Americans is functionally illiterate. Like any skill, reading is a habit. If you grew up in a household where you did not see your parents reading, you likely did not develop the habit. This correlates with the socio-economic divide. Two thirds of American students who lack reading skills by the end of fourth grade will rely on welfare as adults. To impact this, research suggests that we need to start at the earliest years.
I am passionate about the research in support of art and theatre as well as story-telling to improve childhood learning. Litnerd is the marriage of these interests. The inspiration comes from Sesame Street and Hamilton The Musical. In the late 60s, Joan Cooney decided to produce a children’s TV show that would influence children across America to learn to read—it became Sesame Street. Cooney researched her idea extensively, consulting with sociologists and scientists, and found that TV’s stickiness can be an important tool for education. Lin-Manuel Miranda took the story of Alexander Hamilton and brought it to life as a musical. Kids have learned more about Hamilton’s history thanks to Hamilton the Musical than any of their textbooks. In fact, this was the case so much that a program called EduHam is used to teach history in middle schools across the nation. When I heard that, the lightbulb went off and I decided to go all in on starting Litnerd.
We hire art and theatre professionals to recreate scenes directly from books in episode style format to bring the book to life, in a similar fashion to watching your favorite TV shows. We literally lead 'read out loud' in the classroom while the teacher/actor is acting out the main character in the book. We have a weekly designated Litnerd period in the schools/classes we serve and we live-stream in our teachers/actors for an interactive session (the students participate and read live with the actor as well as complete written lesson plans, phonetic exercises etc). We are currently serving 14,000 students in this manner.
The format of our program is such that if you don't complete the assigned reading and worksheets, you will feel like you are missing out on what is happening in later episodes. In this way, reading is layered in as a fundamental core to the program. Our program is part of scheduled classroom time.
A big part of our business involves curating content and materials that capture the interest and coolness-factor for elementary school students. We’ve found that students love choose-your-own-adventure style stories, especially ones involving mythical creatures—something about being able to have autonomy on the outcomes. So far, it seems to be working. We've even received fan mail from students! But we are obsessed with staying cool/relevant in our content.
Teachers like our product because it eases the burden placed on them. US teachers typically spend 4 to 10 hours a week (unpaid) planning their curriculum and $400-800 of their own money for classroom supplies. That's outrageous! When designing Litnerd, we wanted to ensure our product was not adding more work to their plate. Our programs are led by our own Resident Teaching Artists, who are live streamed into the classroom and remain in character to the episode as they teach the Litnerd curriculum built on top of the books. Our programs come with lesson plans, activity packets, curriculum correlations, educator resources, and complete ebooks.
Traditional K-12 education has extremely long sale cycles and is hard to break into. It can take years to become a contracted vendor, especially with large districts like NYC Department of Education. Because of my experience with my first YC backed startup that sold to government and nonprofits, coupled with my experience working at a large edtech company that built content for Higher Ed, I understand this sector and how to navigate the budget line item process.
Since launching in January, we have become contracted vendors with the New York City Department of Education (the largest education district in America). As a result, we’ve been growing at 60% MoM, are currently used by over 14k students in their classrooms and hit $110K in ARR. Our program is part of scheduled classroom time for elementary schools—not homework, and not extracurricular. Here’s a walkthrough video from a teacher’s perspective: https://www.loom.com/share/9ffc59f0d7ed4a66964003703bba7b94.
I am so grateful for the opportunity to share our story and mission with you. If you loved or struggled with reading as a kid, what factors do you think contributed? Also, if you have experience teaching elementary school or if you are a parent, I would love to hear your thoughts and ideas on how you foster reading amongst your students/children! I am excited to hear your feedback and ideas to help us inspire the next generation of readers.
This looks like a really thoughtful resource, and I can tell you are hitting all the right marks when it comes to getting this into classrooms. As a first grade teacher, however, I am curious how you see Litnerd fitting into a balanced literacy program. In browsing your programs, I see language comprehension and social emotional learning but no phonics or phonemic awareness. Do you have plans to support these crucial reading skills as well?
I'm thrilled to have our first elementary school teacher comment!! Yes! We have a team of educators and curriculum writers (broken up by Grades PreK-K, 1-2 and 3-5) that follow Common Core standards and build custom ELA curriculum on top of each book and Litnerd program. We provide 2 lesson plans per week and a total of 8 lesson plans per program. We also create SEL lessons (again, based on the book) that follow NYSED competencies and 'Leader in Me' program language (since that is what most of our schools currently use for SEL). We hope to continue to improve over time and I would definitely welcome any feedback for us as we grow into this area!
TIL a new acronym word symbol lexeme: SEL: Social and Emotional Learning
> Social Emotional Learning (SEL) is an education practice that integrates social emotional skills into school curriculum. SEL is otherwise referred to as "socio-emotional learning" or "social-emotional literacy." When in practice, social emotional learning has equal emphasis on social and emotional skills to other subjects such as math, science, and reading.[1] The five main components of social emotional learning are self-awareness, self management, social awareness, responsible decision making, and relationship skills.
https://en.wikipedia.org/wiki/Social_and_Emotional_Learning
For good measure, Common Core English Language Arts standards: https://en.wikipedia.org/wiki/Common_Core_State_Standards_In...
Khan Academy has 2nd-9th Grade ELA exercises: English & Language Arts: https://www.khanacademy.org/ela
Unfortunately AFAIU there's not a good way to explore the Khan Academy Kids curriculum graph; which definitely does include reading: https://learn.khanacademy.org/khan-academy-kids/
> The app engages kids in core subjects like early literacy, reading, writing, language, and math, while encouraging creativity and building social-emotional skills
In terms of phonemic awareness and phonological awareness, is there a good survey of US and world reading programs and their evidence base, if any?
From https://en.wikipedia.org/wiki/Phonemic_awareness :
> Phonemic awareness is a subset of phonological awareness in which listeners are able to hear, identify and manipulate phonemes, the smallest mental units of sound that help to differentiate units of meaning (morphemes). Separating the spoken word "cat" into three distinct phonemes, /k/, /æ/, and /t/, requires phonemic awareness. The National Reading Panel has found that phonemic awareness improves children's word reading and reading comprehension and helps children learn to spell.[1] Phonemic awareness is the basis for learning phonics.[2]
> Phonemic awareness and phonological awareness are often confused since they are interdependent. Phonemic awareness is the ability to hear and manipulate individual phonemes. *Phonological awareness includes this ability, but it also includes the ability to hear and manipulate larger units of sound, such as onsets and rimes and syllables.*
What are some of the more evidence-based (?) (early literacy,) reading curricula? OTOH: LETRS, Heggerty, PAL: https://www.google.com/search?q=site%3Aen.wikipedia.org+%22l...
Looks like Cambium acquired e.g. Kurzweil Education in 2005?
More context:
Reading readiness in the United States: https://en.wikipedia.org/wiki/Reading_readiness_in_the_Unite...
Emergent literacies: https://en.wikipedia.org/wiki/Emergent_literacies
An interactive IPA chart with videos and readings linked with RDF (e.g. ~WordNet RDF) would be great. From "Duolingo's language notes all on one page" https://westurner.github.io/hnlog/#comment-26430146 :
> An IPA (International Phonetic Alphabet) reference would be helpful, too. After taking linguistics in college, I found these Sozo videos of US english IPA consonants and vowels that simultaneously show {the ipa symbol, example words, someone visually and auditorily producing the phoneme from 2 angles, and the spectrogram of the waveform} but a few or a configurable number of [spaced] repetitions would be helpful: https://youtu.be/Sw36F_UcIn8
> IDK how cartoonish or 3d of an "articulatory phonetic" model would reach the widest audience. https://en.wikipedia.org/wiki/Articulatory_phonetics
> IPA chart: https://en.wikipedia.org/wiki/International_Phonetic_Alphabe...
> IPA chart with audio: https://en.wikipedia.org/wiki/IPA_vowel_chart_with_audio
> All of the IPA consonant chart played as a video: "International Phonetic Alphabet Consonant sounds (Pulmonic)- From Wikipedia.org" https://youtu.be/yFAITaBr6Tw
> I'll have to find the link of the site where they playback youtube videos with multiple languages' subtitles highlighted side-by-side along with the video.
>> [...] Found it: https://www.captionpop.com/
>> It looks like there are a few browser extensions for displaying multiple subtitles as well; e.g. "YouTube Dual Subtitles", "Two Captions for YouTube and Netflix"
Phonics programs really could reference IPA from the start: there are different sounds for the same letters; IPA is the most standard way to indicate how to pronounce words: it's in the old school dictionary, and now it's in the Google "define:" or just "define word" dictionary.
UN Sustainable Development Goal 4: Quality Education: https://www.globalgoals.org/4-quality-education
> Target 4.6: Universal Literacy and Numeracy
> By 2030, ensure that all youth and a substantial proportion of adults, both men and women, achieve literacy and numeracy.
https://sdgs.un.org/goals/goal4 :
> Indicator 4.6.1: Percentage of population in a given age group achieving at least a fixed level of proficiency in functional (a) literacy and (b) numeracy skills, by sex
... Goals, Targets, and Indicators.
Which traversals of a curriculum graph are optimal or sufficient?
You can add https://schema.org/about and https://schema.org/educationalAlignment Linked Data to your [#OER] curriculum resources to increase discoverability, reusability.
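As a sketch of what that Linked Data markup can look like, here is an illustrative JSON-LD object for a lesson resource, built as a Python dict (the resource name and the CCSS alignment target are hypothetical examples, not identifiers from any real curriculum):

```python
import json

# Illustrative JSON-LD for an #OER lesson resource using schema.org's
# `about` and `educationalAlignment` properties. All values are examples.
lesson = {
    "@context": "https://schema.org",
    "@type": "LearningResource",
    "name": "Phonemic Awareness: Segmenting CVC Words",
    "about": {"@type": "Thing", "name": "Phonemic awareness"},
    "educationalAlignment": {
        "@type": "AlignmentObject",
        "alignmentType": "teaches",
        "educationalFramework": "Common Core State Standards",
        "targetName": "CCSS.ELA-LITERACY.RF.K.2",
    },
}

# Serialize for embedding in a <script type="application/ld+json"> tag
print(json.dumps(lesson, indent=2))
```

Embedding something like this in each resource page is what lets aggregators and search engines discover and filter by standard alignment.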
Aarne–Thompson–Uther Index code URN URIs could be helpful: https://en.wikipedia.org/wiki/Aarne%E2%80%93Thompson%E2%80%9...
> The Aarne–Thompson–Uther Index (ATU Index) is a catalogue of folktale types used in folklore studies.
Are there competencies linked to maybe a nested outline that we typically traverse in depth-first order? https://github.com/todotxt/todo.txt : Todo.txt format has +succinct @context labels. Some way to record and score our own paths objectively would be great.
> What are some of the more evidence-based (?) (early literacy,) reading curricula? OTOH: LETRS, Heggerty, PAL
Looks like there are only 21 search results for: "LETRS" "Fundation" "Heggerty": https://www.google.com/search?q="LETRS"+"fundation"+"heggert...
What is the name for this category of curricula?
Perhaps the US Department of Education or similar could compare early reading programs in a wiki[pedia] page, according to criteria to include measures of evidence-basedness? Just like https://collegescorecard.ed.gov/data/ has "aggregate data for each institution [&] Includes information on institutional characteristics, enrollment, student aid, costs, and student outcomes."
From YouTube, it looks like there are cool hand motions for Heggerty.
Nimforum: Lightweight alternative to Discourse written in Nim
awesome-selfhosted > "Communication - Social Networks and Forums" doesn't yet link to Nimforum: https://github.com/awesome-selfhosted/awesome-selfhosted#com...
An Opinionated Guide to Xargs
Wanting verbose logging from xargs, years ago I wrote a script called `el` (edit lines) that basically does `xargs -0` with logging. https://github.com/westurner/dotfiles/blob/develop/scripts/e...
It turns out that e.g. -print0 and -0 are the only safe way: newlines in filenames aren't escaped, so newline-delimited output is ambiguous:
find . -type f -print0 | el -0 --each -x echo
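To see the difference concretely (the demo directory and filenames below are made up):

```shell
# One filename contains a space; a newline in a name breaks even more tools.
mkdir -p demo && touch "demo/a b" demo/plain

# Unsafe: xargs word-splits newline-delimited names, so 2 files become 3 args
find demo -type f | xargs -n1 echo | wc -l

# Safe: NUL-delimited from producer (-print0) to consumer (-0): 2 args
find demo -type f -print0 | xargs -0 -n1 echo | wc -l
```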
GNU Parallel is a much better tool: https://en.wikipedia.org/wiki/GNU_parallel
(author here) Hm, I don't see either of these points, because:
GNU xargs has --verbose which logs every command. Does that not do what you want? (Maybe I should mention its existence in the post)
xargs -P can do everything GNU parallel does, which I mention in the post. Any counterexamples? GNU parallel is a very ugly DSL IMO, and I don't see what it adds.
--
edit: Logging can also be done by recursively invoking shell functions that log, with the $0 Dispatch Pattern explained in the post. I don't see a need for another tool; this is the Unix philosophy and compositionality of shell at work :)
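A minimal sketch of that $0 dispatch pattern (the file name demo.sh and the function names are illustrative): the script's final `"$@"` line executes whatever function name it is given, so xargs can re-invoke the script to call its own shell functions.

```shell
#!/bin/sh
# demo.sh -- $0 dispatch: `./demo.sh main` runs main(), and xargs re-invokes
# this same script ("$0") to run do_one() once per input line.
do_one() {
  echo "processing: $1"   # a logging wrapper would go here
}
main() {
  seq 3 | xargs -n1 -- "$0" do_one
}
"$@"    # dispatch: run the function named by the first argument
```

Invoked as `./demo.sh main`, this prints one `processing:` line per input.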
Parallel's killer feature is how it spools subprocess output, ensuring that it doesn't get jumbled together. xargs can't do that. I use parallel for things like shelling out to 10000 hosts and getting some statistics. If I use xargs the output stomps all over itself.
Ah OK thanks, I responded to this here: https://news.ycombinator.com/item?id=28259473
As far as I'm aware, xargs still has the problem of multiple jobs being able to write to stdout at the same time, potentially causing their output streams to be intermingled. Compare this with parallel's --group.
Also, parallel can run some of those jobs on remote machines. I don't believe xargs has an equivalent job management function.
In your examples you fail to put 'xargs -P' in the middle of a pipeline: you only put it at the end.
In other words:
some command | xargs -P other command | third command
This is useful if 'other command' is slow. If you buffer on disk, you need to clean up after each task: maybe there is not enough free disk space to buffer the output of all tasks. UNIX is great in that you can pipe commands together, but due to the interleaving issue 'xargs -P' fails here. It does not live up to the UNIX philosophy. Which is probably why you unconsciously only use it at the end of a pipeline.
You can find a different counterexample on https://unix.stackexchange.com/questions/405552/using-xargs-... I will be impressed if you can implement that using xargs. Especially if you can make it cleaner than the parallel version.
[deleted]
Yeah but xargs doesn't refuse to run until I have agreed to a EULA stating I will cite it in my next academic paper.
parallel doesn't either, it just nags. I agree about how silly and annoying it is. Imagine if every time the parallel author opened Firefox he got a message reminding him to personally thank me if he uses his web browser for research, or if every time his research program calls malloc he has to acknowledge and cite Ulrich Drepper. Very very silly.
Parallel is the better tool but the nagware impairs its reputation.
Enhanced Support for Citations on GitHub
> CITATION.cff files are plain text files with human- and machine-readable citation information. When we detect a CITATION.cff file in a repository, we use this information to create convenient APA or BibTeX style citation links that can be referenced by others.
https://schema.org/ScholarlyArticle RDFa and JSON-LD can be parsed with a standard Linked Data parser. Looks like YAML-LD requires quoting e.g. "@context": and "@id":
From https://docs.github.com/en/github/creating-cloning-and-archi... ; in your repo's /CITATION.cff:
cff-version: 1.2.0
message: "If you use this software, please cite it as below."
authors:
- family-names: "Lisa"
given-names: "Mona"
orcid: "https://orcid.org/0000-0000-0000-0000"
- family-names: "Bot"
given-names: "Hew"
orcid: "https://orcid.org/0000-0000-0000-0000"
title: "My Research Software"
version: 2.0.4
doi: 10.5281/zenodo.1234
date-released: 2017-12-18
url: "https://github.com/github/linguist"
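Since CITATION.cff is plain YAML, converting it to BibTeX is mechanical. A minimal sketch (using the example metadata above as a Python dict rather than parsing YAML, so it stays stdlib-only; real tooling should use a YAML parser and handle optional fields):

```python
# Sketch: turn CFF-style citation metadata into a BibTeX @software entry.
# The dict mirrors the CITATION.cff example above; fields are illustrative.
cff = {
    "authors": [
        {"family-names": "Lisa", "given-names": "Mona"},
        {"family-names": "Bot", "given-names": "Hew"},
    ],
    "title": "My Research Software",
    "version": "2.0.4",
    "doi": "10.5281/zenodo.1234",
    "url": "https://github.com/github/linguist",
}

def cff_to_bibtex(cff):
    """Render a minimal BibTeX @software entry from CFF-style metadata."""
    authors = " and ".join(
        f"{a['family-names']}, {a['given-names']}" for a in cff["authors"])
    key = cff["title"].replace(" ", "_")
    return (f"@software{{{key},\n"
            f"  author = {{{authors}}},\n"
            f"  title = {{{cff['title']}}},\n"
            f"  version = {{{cff['version']}}},\n"
            f"  doi = {{{cff['doi']}}},\n"
            f"  url = {{{cff['url']}}},\n"
            f"}}")

print(cff_to_bibtex(cff))
```

GitHub's own "Cite this repository" button does essentially this conversion for you.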
https://citation-file-format.github.io/
Canada calls screen scraping ‘unsecure,’ sets Open Banking target for 2023
It's about time. When I learned that applications like YNAB (You Need A Budget) use services like Plaid to connect to my bank account, and that these services literally take my username and password and impersonate me to get my banking data, I was a little sketched out. I use YNAB every day, and having it connected to my bank account is incredibly useful, but if something goes wrong and Plaid loses my money somehow, is there any recourse?
Hopefully individuals will be able to use the Open Banking APIs to access their own data directly, but it looks like accreditation will be required, so probably not.
Here's the full text of the report: https://www.canada.ca/en/department-finance/programs/consult...
Plaid is only one security breach away from being utterly destroyed. And they will take out the financial lives of all their customers with them.
It’s utterly irresponsible and I have no idea how Plaid hasn’t been shut down. You have no recourse if they are breached. The TOS of your online banking probably says that if you disclose your username and password to any third party then you have no liability protections.
This is FUD. Lots of Plaid-based connections only allow reads. This is a regulated industry, and the fallout reputationally might be tough, but consumers are well-protected.
How is that enforced? What is the technical basis that enforces read-only access using user/password auth? Especially since that user/password auth is used by an end user to do "write"-type actions?
Read-only access is not possible. By handing over the credentials you are handing over write access. You are correct.
A couple of my banking institutions let me generate a read-only set of credentials for this sort of purpose.
Citi and Capital One have OAuth flows that Plaid supports, too, which tends to make me angrier at the banks than Plaid; the need for this stuff has been clear for a decade now, but only a few have added OAuth or similar.
AFAIU, there are still zero (0) consumer banking APIs in the US with read-only (e.g. OAuth) access as well?
Banks could save themselves CPU, RAM, bandwidth, and liability by implementing read-only API tokens and methods that need only return JSON - instead of HTML or worse, monthly PDF tables for a fee - possibly similar to the Plaid API: https://plaid.com/docs/api/
There is competition in consumer/retail banking, but still the only way to do e.g. budget and fraud analysis with third party apps is to give away all authentication factors: u/p/sqa; and TBH that's unacceptable.
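What scoped, read-only tokens would buy is easy to sketch: the token carries an explicit scope, and write endpoints reject read-only tokens, so a leaked budgeting-app token cannot move money. (The scope names and actions below are made up for illustration; no real bank API works exactly like this.)

```python
# Hypothetical scope check: a read-only token can list transactions but can
# never initiate a transfer, regardless of how the token leaks.
ALLOWED_ACTIONS = {
    "transactions:read": {"list_transactions"},
    "transfers:write": {"list_transactions", "create_transfer"},
}

def authorize(token_scope: str, action: str) -> bool:
    """Return True only if the token's scope permits the requested action."""
    return action in ALLOWED_ACTIONS.get(token_scope, set())
```

Contrast this with handing over username/password/security-question answers, which grants every action the user themselves could take.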
Traditional and distributed ledger service providers might also consider W3C ILP: Interledger Protocol (in starting their move to quantum-resistant ledgers by 2022 in order to have a 5 year refresh cycle before QC is a real risk by 2027, optimistically, for science) when reviewing the entropy of username+password_hash+security_question_answer strings in comparison to the entropy of cryptoasset account public key hash strings: https://interledger.org/developer-tools/get-started/overview...
> Sender – Initiates a value transfer.
> Router (Connector) – Applies currency exchange and forwards packets of value. This is an intermediary node between the sender and the receiver. {MSB: KYC, AML, 10k reporting requirement, etc}
> Receiver – Receives the value
Multifactor authentication: Something you have, something you know, something you are
Multisig: n-of-m keys required to approve a transaction
Edit: from "Fed announces details of new interbank service to support instant payments" https://news.ycombinator.com/item?id=24109576 :
> For purposes of Interledger, we call all settlement systems ledgers. These can include banks, blockchains, peer-to-peer payment schemes, automated clearing house (ACH), mobile money institutions, central-bank operated real-time gross settlement (RTGS) systems, and even more. […]
> You can envision the Interledger as a graph where the points are individual nodes and the edges are accounts between two parties. Parties with only one account can send or receive through the party on the other side of that account. Parties with two or more accounts are connectors, who can facilitate payments to or from anyone they're connected to.
> Connectors [AKA routers] provide a service of forwarding packets and relaying money, and they take on some risk when they do so. In exchange, connectors can charge fees and derive a profit from these services. In the open network of the Interledger, connectors are expected to compete among one another to offer the best balance of speed, reliability, coverage, and cost.
W3C ILP: Interledger Protocol > Peering, Clearing and Settling: https://interledger.org/rfcs/0032-peering-clearing-settlemen...
> Hopefully individuals will be able to use the Open Banking APIs to access their own data directly, but it looks like accreditation will be required, so probably not.
When you loan your money to a bank by depositing ledger dollars or cash - and they, since GLBA in 1999, invest it and offer less than a 1% checking interest rate - they won't even give you the record of all of your transactions as CSV/OFX (`SELECT * FROM transactions WHERE account_id=?`); you have to pay $20/mo per autogenerated PDF containing a table of transactions to scrape with e.g. PDFMiner (because they don't keep all account history data online).
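That query really is the whole feature. A sketch with Python's stdlib sqlite3 and csv modules (the schema and column names are made up for illustration):

```python
import csv
import io
import sqlite3

# Toy ledger: a bank-side "export my transactions as CSV" endpoint would be
# roughly this much code. The schema and rows are illustrative only.
db = sqlite3.connect(":memory:")
db.execute(
    "CREATE TABLE transactions"
    " (account_id TEXT, date TEXT, amount REAL, memo TEXT)")
db.executemany(
    "INSERT INTO transactions VALUES (?, ?, ?, ?)",
    [("acct1", "2021-08-01", -4.50, "coffee"),
     ("acct1", "2021-08-02", 1200.00, "payroll"),
     ("acct2", "2021-08-02", -9.99, "streaming")])

def export_csv(account_id: str) -> str:
    """Return one account's transactions as CSV text."""
    buf = io.StringIO()
    writer = csv.writer(buf)
    writer.writerow(["date", "amount", "memo"])
    rows = db.execute(
        "SELECT date, amount, memo FROM transactions WHERE account_id=?",
        (account_id,))
    writer.writerows(rows)
    return buf.getvalue()

print(export_csv("acct1"))
```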
Seemingly OT, but not. APIs for comparison here:
FinTS / HBCI: Home Banking Computer Information protocol https://en.wikipedia.org/wiki/FinTS
E.g. GNUcash (open source double-entry accounting software) supports HBCI (and QIF (Quicken format), and OFX (Open Financial Exchange)). https://www.gnucash.org/features.phtml
HBCI/FinTS has been around in Germany for quite a while, but nowhere else has comparable banking standards? I.e. Plaid may (unfortunately, due to lack of read-only tokens across the entire US consumer banking industry) be the most viable option for implementing HBCI-like support in GNUcash
OpenBanking API Specifications: https://standards.openbanking.org.uk/api-specifications/
Web3 (Ethereum,) APIs: https://web3py.readthedocs.io/en/stable/web3.main.html#rpc-a...
ISO20022 is "A single standardisation approach (methodology, process, repository) to be used by all financial standards initiatives" https://www.iso20022.org/
Brazil's PIX is one of the first real implementers of ISO20022. A note regarding such challenges: https://news.ycombinator.com/item?id=24104351
What data format does the SEC's CAT (Consolidated Audit Trail) expect to receive mandatory financial reporting information in? Could ILP simplify banking and financial reporting at all?
FWIU, RippleNet (?) is the only network that supports attachments of e.g. line-item invoices (that we'd all like to see in the interest of transparency and accountability in government spending).
W3C ILP: Interledger Protocol. See links above.
Of the specs in this loose category, only cryptoledgers do not depend upon (DNS or) TLS/SSL - at the protocol layer, at least - and every CA in the kept-up-to-date trusted CA cert bundle (that could be built from a CT Certificate Transparency log of cert issuance and revocation events kept in a blockchain or e.g. centralized google/trillian, which they have the trusted sole root and backup responsibilities for).
Though, the DNS dependency has probably crept back into e.g. the bitcoind software by now (which used to bootstrap its list of peer nodes (~UNL) from an IRC IP address instead of a DNS domain).
FWIU, each trusted ACH (US 'Direct Deposit') party has a (one) GPG key that they use to sign transaction documents sent over now (S)FTP on scout's honor - on behalf of all of their customers' accounts.
Interactive Linear Algebra (2019)
For those interested in these kinds of interactive math experiences, I have been keeping track of them for a while. Here is my list so far:
• https://www.intmath.com/ - Interactive Mathematics Learn math while you play with it
• http://worrydream.com/LadderOfAbstraction/ - up and down the ladder of abstraction
• https://betterexplained.com/ - Intuitive guides to various things in math
• https://www.math3ma.com/blog/matrices-probability-graphs - Viewing Matrices & Probability as Graphs
• http://immersivemath.com/ila/index.html - immersive linear alg
https://github.com/topics/linear-algebra?l=jupyter+notebook lists "Computational Linear Algebra for Coders" https://github.com/fastai/numerical-linear-algebra
"site:GitHub.com inurl:awesome linear algebra jupyter" lists a few awesome lists with interactive linear algebra resources: https://www.google.com/search?q=site%3Agithub.com+inurl%3Aaw...
3blue1brown's "Essence of linear algebra" playlist has some excellent tutorials with intuition-building visualizations built with manim: https://youtube.com/playlist?list=PLZHQObOWTQDPD3MizzM2xVFit...
Git password authentication is shutting down
I'm fine with this change for my usage; I don't think I've used password auth for myself or any automated service I've set up for years now. However, this will introduce more confusion for newcomers who already have to figure out what Git, GitHub, etc. are. I just spent some time last weekend teaching someone the basics of how to create a new project. Such a simple idea required introducing the terminal, basic terminal commands, GitHub, git and its most common commands. It took about 3 hours for us to get through just the most basic pieces. Adding on ssh-keygen and key management adds even more friction.
It's certainly a difficult problem. How can we offer a more gentle learning curve for budding developers while still requiring "real" projects to use best practices for security and development?
I also have found teaching someone how to be even marginally capable of contributing to a Github project from scratch to be a very time consuming and frustrating thing. Think, having your graphics designer able to make commits, or having someone who only wants to update docs.
The worst part is the "easier" solutions are actually just footguns in disguise, as soon as they accidentally click the wrong thing and end up with a detached HEAD, a few commits ahead and behind the REMOTE, and halfway through a botched merge, you have to figure out how to bail them out of that using a GUI you've never actually used. Knowing all this, you either teach them git (high short term pain, high chance of them just giving up immediately) or you tell them to download the first result for "windows foss git gui" and pray that history won't repeat itself.
The solution that everyone actually uses until they learn the unnecessary details is "take a backup of relevant files and blow away & redownload the repo".
wait there is another way?
git stash
git reset --hard master
git pull
git stash pop
Uh oh, you've now merged remote master into your local master, which was two commits ahead and 3 behind!
My git config has fast-forward only for pulls and I habitually use --ff-only for any pull :)
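For reference, a minimal sketch of that configuration (these are real git config keys; everything else here is illustrative):

```shell
# Make every plain `git pull` refuse anything that is not a fast-forward:
git config --global pull.ff only

# The per-invocation equivalent:
# git pull --ff-only

# Confirm the setting:
git config --global pull.ff   # prints: only
```

With `pull.ff` set to `only`, a diverged branch makes `git pull` abort instead of silently creating a merge commit.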
The command you want is "git pull --rebase". There are configuration settings to make "git pull" rebase by default, and I'd recommend always turning that on by default (which may be why the person above omitted it from his pull command).
I actually thought "git pull" did "git pull --rebase" by default (this may be what you get from running `git --configure` without modifications?), but maybe I've just been configuring it that way. You can achieve this in your global Git configuration by setting "pull.rebase" to "true".
I don't think it's sane behavior for "git pull" to do anything else besides rebase the local change onto the upstream branch, so I'm surprised it's not the command's default behavior. Has the project not changed the CLI for compatibility reasons or something?
When do you ever want a "git pull" that's not a rebase? That generates a merge commit saying "Merge master into origin/master" (or something similar) which is stupid. If you really want to use actual branches for some reason, that's fine, but "merge master into master" commits are an anti-pattern that if I ever see in a Git repository I'm working on or responsible for, results in me having a conversation with the author about how to use Git correctly.
I actually don't want to rebase. Rebase can rewrite your local history and make it impossible for you to push without doing a force push (or some other more complicated maneuvering). The only time pull with rebase is okay (on a shared branch such as master) is when you know that your local branch is strictly behind remote. That's exactly what --ff-only does. It fast-forwards only if you are behind remote, and declines to do so if you are ahead so you can sort your shit out without rewriting history.
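A self-contained demonstration of that behavior (repo names and commit identities are illustrative); once the clone and its upstream have diverged, a fast-forward-only pull declines rather than merging or rebasing:

```shell
# Create an upstream repo with one commit, then clone it.
git init -q upstream && cd upstream
git symbolic-ref HEAD refs/heads/main
git -c user.name=demo -c user.email=demo@example.com commit -q --allow-empty -m one
cd .. && git clone -q upstream clone

# Diverge both sides: a new upstream commit and a new local commit.
cd upstream
git -c user.name=demo -c user.email=demo@example.com commit -q --allow-empty -m two
cd ../clone
git -c user.name=demo -c user.email=demo@example.com commit -q --allow-empty -m local

# The branches have diverged, so a fast-forward-only pull aborts:
git pull --ff-only 2>/dev/null || echo "refused: sort out local commits first"
```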
If "git pull --rebase" succeeds without reporting a merge conflict, then it's tantamount to having executed the command with "--ff-only".
I believe you are incorrect, however. "git pull --rebase" will not ever rewrite history from the origin repository. It will only ever modify your local unpublished commits to account for new commits from the origin branch. If your change and the new commits from origin aren't touching the same lines or files, then the update is seamless [1].
If there is a conflict because your change and a new commit from origin both changed the same line in the same file, this can result in a merge commit that you need to resolve. But you resolve this locally, by updating your (unpushed, local) commit(s) to account for the new history. When you complete the merge, it will not show up as a merge commit in the repository -- you will simply have amended your local unpublished commit(s) to account for the changes in upstream, and you will have seen every conflict and resolved each one yourself. When the process is complete, and you've resolved the merge, you'll have a nice linear branch where your commit(s) are on the tip of the origin branch.
The flag "--ff-only" basically just means "refuse to start a local merge while performing this operation, and instead fail".
Because of the potential for these merge conflicts, it's a best practice to "git pull" frequently, so that if there are conflicts you can deal with them incrementally (and possibly discuss the software design with coworkers/project partners if the design is beginning to diverge -- and so people can coordinate what they're working on and try to stay out of each other's way), instead of working through a massive pile of conflicts at the end of your "local 'branch'" (i.e. the code in your personal Git repo that constitutes unpushed commits to the upstream branch).
Additionally, all of the central Git repositories I've worked in in a professional context were also configured to disallow "git push --force" (a command that can rewrite history), for all "real" branches. These systems gave users their own private Git repo sandbox (just like you can fork a repo on GitHub) where they can force push if they want to (but is backed up centrally like GitHub to avoid the loss of work). This personal repo was very useful for saving my work. I was in the habit of committing and pushing to the private repo about once every 30m-1h (to eliminate the chance of major work loss due to hardware failure). Almost always I'd squash all of these commits into one before rebasing onto master, so that the change comes in as a single commit, unless it would be too large to review.
In the occasional circumstances where I've legitimately needed to rewrite history for some reason -- say credentials got into the repository, or someone generated one of these "merge master into master" commits -- then I would change HEAD from master to another branch, delete master, and then recreate it with the desired state. (And even that operation would show up in the system's logs, so in the case of something like credentials you'd additionally contact the security or source control team to make sure the commit objects containing the credentials were actually deleted out of the repo history completely, including stuff you can find only via the reflog.) Then contact the team working on the repo to let them know that you had to rewrite history to correct a problem.
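A self-contained sketch of that recovery sequence (repo, branch, and commit names are illustrative; on a real server you would run the branch commands against the central repository, and still need the follow-up cleanup of unreferenced objects described above):

```shell
# Build a repo whose tip commit must be removed from 'master'.
git init -q work && cd work
git symbolic-ref HEAD refs/heads/master
git -c user.name=demo -c user.email=demo@example.com commit -q --allow-empty -m good
good=$(git rev-parse HEAD)
git -c user.name=demo -c user.email=demo@example.com commit -q --allow-empty -m "oops: credentials"

# 1. Point HEAD at another branch, 2. delete master, 3. recreate it at the desired state.
git branch tmp && git symbolic-ref HEAD refs/heads/tmp
git branch -D master
git branch master "$good"

git log -1 --format=%s master   # prints: good
```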
I would recommend disabling "git push --force" for all collaborative projects. If you're operating the repository, you can do this by setting "receive.denyNonFastForwards" to "true" in the repository's Git configuration. In GitHub there's probably a repository setting switch somewhere.
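On a repository you host yourself, that setting is applied to the server-side (bare) repo directly; a minimal sketch with an illustrative repo name:

```shell
# Create a bare repo (as would sit on the server) and deny non-fast-forward pushes:
git init -q --bare shared.git
git -C shared.git config receive.denyNonFastForwards true

# Confirm; force pushes to any branch of shared.git will now be rejected.
git -C shared.git config receive.denyNonFastForwards   # prints: true
```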
Once professional software engineers start working with Git all day long, they quickly get past the beginner stage and the need to do this kind of stuff is very rare.
[1] It doesn't mean the software will work though, even if both changes would have worked in isolation. You still need to inspect and test the results of "git pull (--ff-only)", since even if there are no conflicts like commits that modify the same lines as yours, or there are conflicts that Git can resolve automatically, it's possible for the resultant software logic to be defective, since Git has no semantic understanding of the code.
`git pull --rebase` usually is what I need to do. To save local changes and rebase to the git remote's branch:
# git branch -av;
# git remote -v;
# git reflog; git help reflog; man git-reflog
# git show HEAD@{0}
# git log -n5 --graph;
git add -A; git status;
git stash; git stash list;
git pull --rebase;
#git pull --rebase origin develop
# git fetch origin develop
# git rebase origin/develop
git stash pop;
git stash list;
git status;
# git commit
# git rebase -i HEAD~5 # squash
# git push
HubFlow does branch merging correctly because I never can. Even when it's just me and I don't remember how I was handling tags of releases on which branch, I just reach for HubFlow now and it's pretty much good.
There's a way to default to --rebase for pulls: is there a reason not to set that in a global gitconfig? Edit: From https://stackoverflow.com/questions/13846300/how-to-make-git... :
> There are now 3 different levels of configuration for default pull behaviour. From most general to most fine grained they are: […]
git config --global pull.rebase true
Maybe start here: http://learngitbranching.js.org/
GitHub Learning Lab: https://lab.github.com/
https://learnxinyminutes.com/docs/git/ #Further_resources
A future for SQL on the web
What's kind of bonkers here is that IndexedDB uses sqlite as its backend. So, this is sqlite (WASM) -> IndexedDB -> sqlite (native).
The Internet is a wild place...
I'm going to build a business that offers SQLite as a web service. It will be backed by a P2P network of browser instances storing data in IndexedDB. Taking investment now.
TIL, about Graph "Protocol for building decentralized applications quickly on Ethereum" https://github.com/graphprotocol
https://thegraph.com/docs/indexing
> Indexers are node operators in The Graph Network that stake Graph Tokens (GRT) in order to provide indexing and query processing services. Indexers earn query fees and indexing rewards for their services. They also earn from a Rebate Pool that is shared with all network contributors proportional to their work, following the Cobb-Douglas Rebate Function.
> GRT that is staked in the protocol is subject to a thawing period and can be slashed if Indexers are malicious and serve incorrect data to applications or if they index incorrectly. Indexers can also be delegated stake from Delegators, to contribute to the network.
> Indexers select subgraphs to index based on the subgraph’s curation signal, where Curators stake GRT in order to indicate which subgraphs are high-quality and should be prioritized. Consumers (eg. applications) can also set parameters for which Indexers process queries for their subgraphs and set preferences for query fee pricing.
It's Ethereum though, so it's LevelDB, not SQLite on IndexedDB on SQLite.
Show HN: Python Source Code Refactoring Toolkit via AST
A similar type project. Though I haven't seen much activity recently:
I haven't used bowler, but from what I can see it is using lib2to3 (through fissix), which can't parse newer Python code (parenthesized context managers, the new pattern matching statement, etc.) due to it using an LL(1) parser. The regular ast on the other hand is always compatible with the current grammar, since that is what CPython internally consumes. It is also handier to work with since the majority of linters already use it, so it would be very easy to port rules.
(Author of Bowler/maintainer of fissix)
The main concern with using the ast module from stdlib is that you lose all formatting/whitespace/comments in your syntax tree, so writing any changes back to the original sources requires doing a lot more extra work to preserve the original formatting, put back comments, etc. This is the entire point of lib2to3/fissix and LibCST, allowing large scale CST manipulation while preserving all of those comments and formatting. We do recognize the limitations of lib2to3/fissix, though, so there have been some backburner plans to move Bowler onto LibCST, as well as building a PEG-based parser for LibCST specifically to enable support for 3.10 and future syntax/grammar changes. But of course, this is very difficult to give any ETA or target for release.
Did you consider PyCQA/RedBaron (which is based upon PyCQA/baron, an AST implementation which preserves comments and whitespace)? https://redbaron.readthedocs.io/en/latest/
It was considered, but the initial goals of Bowler were to a) build on top of lib2to3 in an attempt to get broader upstream support for maintaining it as an "official" cst module, which fizzled out, and b) to use some of the more complex matching semantics that lib2to3 enabled, which baron and other cst alternatives don't really attempt to cover.
Things like "find any function that uses kwargs" or "find any class that defines a method named `bar`" can be easily expressed with lib2to3's matching grammar, and no other CST that I'm aware of (that isn't itself based on lib2to3) has equivalent functionality. This is something we wanted to add to LibCST, but haven't had the time to focus on given other priorities. Meanwhile, we used LibCST to write a safer alternative to isort: https://usort.readthedocs.io
Rog. I think CodeQL (GitHub acquired Semmle and QL in 2019) supports those types of queries; probably atop lib2to3 as well. https://codeql.github.com/docs/writing-codeql-queries/introd...
From https://news.ycombinator.com/item?id=24511280 :
> Additional lists of static analysis, dynamic analysis, SAST, DAST, and other source code analysis tools […]
Emacs' org-mode gets citation support
FWIW, Jupyter-book handles Citations and bibliographies with sphinxcontrib-bibtex: https://jupyterbook.org/content/citations.html
Some notes about Zotero and Schema.org RDFa for publishing [CSL with citeproc] citations: references of Linked Data resources in a graph, with URIs all: https://wrdrd.github.io/docs/tools/index#zotero-and-schema-o...
Compared to trying to parse beautifully typeset bibliographies in PDFs built from LaTeX with a Computer Modern font, search engines can more easily index e.g. https://schema.org/ScholarlyArticle linked data as RDFa, Microdata, or JSON-LD.
Scholarly search engines: Google Scholar, Semantic Scholar, Meta.org,
NSA Kubernetes Hardening Guidance [pdf]
- Scan containers and Pods for vulnerabilities or misconfigurations.
- Run containers and Pods with the least privileges possible.
- Use network separation to control the amount of damage a compromise can cause.
- Use firewalls to limit unneeded network connectivity and encryption to protect confidentiality.
- Use strong authentication and authorization to limit user and administrator access as well as to limit the attack surface.
- Use log auditing so that administrators can monitor activity and be alerted to potential malicious activity.
- Periodically review all Kubernetes settings and use vulnerability scans to help ensure risks are appropriately accounted for and security patches are applied.
> and encryption to protect confidentiality
Probably the hardest part about this. Private networks with private domains. Who runs the private CA, updates DNS records, issues certs, revokes, keeps the keys secure, embeds the keychain in every container's cert stack, and enforces validation?
That is a shit-ton of stuff to set up (and potentially screw up) which will take a small team probably months to complete. How many teams are actually going to do this, versus just terminating at the load balancer and everything in the cluster running plaintext?
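For scale: even a minimal private CA (one self-signed root, one issued leaf, no rotation, no revocation) is several moving parts. A sketch with openssl, all names illustrative:

```shell
# 1. Create a self-signed root CA key and certificate.
openssl req -x509 -newkey rsa:2048 -nodes -days 365 \
  -subj "/CN=internal-root-ca" -keyout ca.key -out ca.crt

# 2. Create a service key and a certificate signing request.
openssl req -newkey rsa:2048 -nodes \
  -subj "/CN=svc.internal.example" -keyout svc.key -out svc.csr

# 3. Issue the service certificate from the CA.
openssl x509 -req -in svc.csr -CA ca.crt -CAkey ca.key \
  -CAcreateserial -days 90 -out svc.crt 2>/dev/null

# 4. Validation, which every client in the cluster must also be configured to enforce.
openssl verify -CAfile ca.crt svc.crt   # prints: svc.crt: OK
```

And that covers issuance only; key distribution, rotation, revocation, and enforcing validation in every container are the parts that consume the months.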
> That is a shit-ton of stuff to set up (and potentially screw up) which will take a small team probably months to complete.
Agree! This is why that "Kubernetes Hardening Guidance" is for NSA, not for startups.
Resource needs aside, keeping basic AppSec/InfoSec hygiene is a strong recommendation. Also there are tons of startups that are trying to provide solutions/services to solve that also. A lot of times, it's worth the money.
This guidance is provided by the NSA, not for the NSA.
From the doc:
>It includes hardening strategies to avoid common misconfigurations and guide system administrators and developers of National Security Systems on how to deploy Kubernetes...
Also:
> Purpose > NSA and CISA developed this document in furtherance of their respective cybersecurity missions, including their responsibilities to develop and issue cybersecurity specifications and mitigations. This information may be shared broadly to reach all appropriate stakeholders.
NSA has multiple mandates and many stakeholders.
Looks like there's actually a "summary of the key recommendations from each section" on page 2.
> Works cited:
> [1] Center for Internet Security, "Kubernetes," 2021. [Online]. Available: https://cisecurity.org/resources/?type=benchmark&search=kube... .
> [2] DISA, "Kubernetes STIG," 2021. [Online]. Available: https://dl.dod.cyber.mil/wp-content/uploads/stigs/zip/U_Kubernetes_V1R1_STIG.zip. [Accessed 8 July 2021].
> [3] The Linux Foundation, "Kubernetes Documentation," 2021. [Online]. Available: https://kubernetes.io/docs/home/ . [Accessed 8 July 2021].
> [4] The Linux Foundation, "11 Ways (Not) to Get Hacked," 18 07 2018. [Online]. Available: https://kubernetes.io/blog/2018/07/18/11-ways-not-to-get-hac... . [Accessed 8 July 2021].
> [5] MITRE, "Unsecured Credentials: Cloud Instance Metadata API." MITRE ATT&CK, 2021. [Online]. Available: https://attack.mitre.org/techniques/T1552/005/. [Accessed 8 July 2021].
> [6] CISA, "Analysis Report (AR21-013A): Strengthening Security Configurations to Defend Against Attackers Targeting Cloud Services." Cybersecurity and Infrastructure Security Agency, 14 January 2021. [Online]. Available: https://us-cert.cisa.gov/ncas/analysis-reports/ar21-013a. [Accessed 8 July 2021].
How can k8s and zero-trust cooccur?
> CISA encourages administrators and organizations review NSA’s guidance on Embracing a Zero Trust Security Model to help secure sensitive data, systems, and services.
"Embracing a Zero Trust Security Model" (2021, as well) https://media.defense.gov/2021/Feb/25/2002588479/-1/-1/0/CSI...
In addition to "zero [trust]", I also looked for the term "SBOM". From p. 32/39:
> As updates are deployed, administrators should also keep up with removing any old components that are no longer needed from the environment. Using a managed Kubernetes service can help to automate upgrades and patches for Kubernetes, operating systems, and networking protocols. *However, administrators must still patch and upgrade their containerized applications.*
"Existing artifact vuln scanners, databases, and specs?" https://github.com/google/osv/issues/55
Hosting SQLite Databases on GitHub Pages
That's one lovely trick.
If I may suggest one thing... instead of range requests on a single huge file how about splitting the file in 1-page fragments in separate files and fetching them individually? This buys you caching (e.g. on CDNs) and compression (that you could also perform ahead of time), both things that are somewhat tricky with a single giant file and range requests.
With the reduction in size you get from compression, you can also use larger pages almost for free, potentially further decreasing the number of roundtrips.
There's also a bunch of other things that could be tried later, like using a custom dictionary for compressing the individual pages.
I think that's the core innovation here, smart HTTP block storage.
I wonder if there has been any research into optimizing all http range requests at the client level in a similar way. i.e. considering the history of requests on a particular url and doing the same predictive exponential requests, or grabbing the full file asynchronously at a certain point.
> Methods for remotely accessing/paging data in from a client when a complete download of the dataset is unnecessary:
> - Query e.g. parquet on e.g. GitHub with DuckDB: duckdb/test_parquet_remote.test https://github.com/duckdb/duckdb/blob/6c7c9805fdf1604039ebed...
> - Query sqlite on e.g. GitHub with SQLite: [Hosting SQLite databases on Github Pages - (or any static file hoster) - phiresky's blog](...)
>> The above query should do 10-20 GET requests, fetching a total of 130 - 270KiB, depending on if you ran the above demos as well. Note that it only has to do 20 requests and not 270 (as would be expected when fetching 270 KiB with 1 KiB at a time). That’s because I implemented a pre-fetching system that tries to detect access patterns through three separate virtual read heads and exponentially increases the request size for sequential reads. This means that index scans or table scans reading more than a few KiB of data will only cause a number of requests that is logarithmic in the total byte length of the scan. You can see the effect of this by looking at the “Access pattern” column in the page read log above.
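The exponential read-ahead idea can be sketched locally (a simulation with dd standing in for HTTP range requests; sizes and filenames are illustrative, and a real client would issue curl range requests instead):

```shell
# Simulate scanning a 64 KiB file with reads that double in size each time,
# the way a range-request prefetcher turns O(N) requests into O(log N).
head -c 65536 /dev/zero > db.bin
total=65536; size=1024; offset=0; requests=0
while [ "$offset" -lt "$total" ]; do
  n=$size
  [ $((offset + n)) -gt "$total" ] && n=$((total - offset))
  # A real client would issue: curl -r "$offset-$((offset + n - 1))" "$url"
  dd if=db.bin bs=1 skip="$offset" count="$n" of=/dev/null 2>/dev/null
  offset=$((offset + n)); size=$((size * 2)); requests=$((requests + 1))
done
echo "$requests requests"   # prints: 7 requests
```

Seven requests (1 KiB, 2 KiB, 4 KiB, ... 32 KiB, plus the 1 KiB remainder) cover the whole 64 KiB scan, versus 64 fixed-size 1 KiB requests.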
> - bittorrent/sqltorrent https://github.com/bittorrent/sqltorrent
>> Sqltorrent is a custom VFS for sqlite which allows applications to query an sqlite database contained within a torrent. Queries can be processed immediately after the database has been opened, even though the database file is still being downloaded. Pieces of the file which are required to complete a query are prioritized so that queries complete reasonably quickly even if only a small fraction of the whole database has been downloaded.
>> […] Creating torrents: Sqltorrent currently only supports torrents containing a single sqlite database file. For efficiency the piece size of the torrent should be kept fairly small, around 32KB. It is also recommended to set the page size equal to the piece size when creating the sqlite database
Would BitTorrent be faster over HTTP/3 (UDP) or is that already a thing for web seeding?
> - https://web.dev/file-system-access/
> The File System Access API: simplifying access to local files: The File System Access API allows web apps to read or save changes directly to files and folders on the user’s device
Hadn't seen wilsonzlin/edgesearch, thx:
> Serverless full-text search with Cloudflare Workers, WebAssembly, and Roaring Bitmaps https://github.com/wilsonzlin/edgesearch
>> How it works: Edgesearch builds a reverse index by mapping terms to a compressed bit set (using Roaring Bitmaps) of IDs of documents containing the term, and creates a custom worker script and data to upload to Cloudflare Workers
> Would BitTorrent be faster over HTTP/3 (UDP) or is that already a thing for web seeding?
The BT protocol itself runs on both TCP and UDP, but it has preferred the UDP variant for many years already.
Thanks. There likely are relative advantages to HTTP/3 QUIC. Here's this from Wikipedia:
> Both HTTP/1.1 and HTTP/2 use TCP as their transport. HTTP/3 uses QUIC, a transport layer network protocol which uses user space congestion control over the User Datagram Protocol (UDP). The switch to QUIC aims to fix a major problem of HTTP/2 called "head-of-line blocking": because the parallel nature of HTTP/2's multiplexing is not visible to TCP's loss recovery mechanisms, a lost or reordered packet causes all active transactions to experience a stall regardless of whether that transaction was impacted by the lost packet. Because QUIC provides native multiplexing, lost packets only impact the streams where data has been lost.
And HTTP Pipelining / Multiplexing isn't specified by just UDP or QUIC:
> HTTP/1.1 specification requires servers to respond to pipelined requests correctly, sending back non-pipelined but valid responses even if server does not support HTTP pipelining. Despite this requirement, many legacy HTTP/1.1 servers do not support pipelining correctly, forcing most HTTP clients to not use HTTP pipelining in practice.
> Time diagram of non-pipelined vs. pipelined connection The technique was superseded by multiplexing via HTTP/2,[2] which is supported by most modern browsers.[3]
> In HTTP/3, the multiplexing is accomplished through the new underlying QUIC transport protocol, which replaces TCP. This further reduces loading time, as there is no head-of-line blocking anymore https://en.wikipedia.org/wiki/HTTP_pipelining
Ask HN: Any good resources on how to be a great technical advisor to startups?
Bumping up https://news.ycombinator.com/item?id=27600539
## Codelabels: Component: title
### ENH,UBY: HN: linkify URIs in descriptions
## User Stories
Users {__, __, } can ___ in order to ___.
Given-When-Then
~ Who-What-Wow
~ {Marketing, Training, Support, Service} Curriculum Competencies
### Users can click on links in descriptions in order to review referenced off-site resources.
Costs/Benefits: Linkspam?
The URL from this {item,} description: https://news.ycombinator.com/item?id=27600539
Teaching other teachers how to teach CS better
git and HTML and Linked Data should be requisite: https://learngitbranching.js.org/
Pedagogy#Modern_pedagogy: https://en.wikipedia.org/wiki/Pedagogy#Modern_pedagogy
Evidence-based_education: https://en.wikipedia.org/wiki/Evidence-based_education
Computational_thinking#Characteristics: https://en.wikipedia.org/wiki/Computational_thinking#Charact... (Abstraction, Automation, Analysis)
Learning: https://en.wikipedia.org/wiki/Learning
Autodidacticism: https://en.wikipedia.org/wiki/Autodidacticism
Design of Experiments; Hypotheses, troubleshooting, debugging, automated testing, Formal Methods, actual Root Cause Analysis: https://en.wikipedia.org/wiki/Design_of_experiments
Critical Thinking; definitions, Logic and Rationality, Logical Reasoning: Deduction, Abduction and Induction: https://en.wikipedia.org/wiki/Critical_thinking#Logic_and_ra...
Doesn't this all derive from [Quantum] Information Theory? It's actually fascinating to start at Information Theory; who knows what that curriculum would look like without reinforcement and [3D] videos: https://en.wikipedia.org/wiki/Information_theory
Stone, James V. "Information theory: a tutorial introduction." (2015). https://scholar.google.com/scholar?q=%22Information+Theory:+...
It used to be that we had to start engines with a turn of a crank: that initial energy to overcome inertia was enough for the system to feed-forward without additional reinforcement. Effective CS instruction may motivate the unmotivated to care about learning the way folks who are receiving reinforcement do: intrinsically.
Ask HN: Best online speech / public speaking course?
Hi HN - Has anyone taken an online course to help them with public speaking, speech and voice skills that they’d highly recommend? Thanks!
"TED Talks: The Official TED Guide to Public Speaking" https://smile.amazon.com/TED-Talks-Official-Public-Speaking-...
TED Masterclass: https://masterclass.ted.com/
"Power Talk: Using Language to Build Authority and Influence" https://smile.amazon.com/Power-Talk-Language-Authority-Influ...
Re: Clean Language and Symbolic Modeling; listening to metaphors and asking clean questions may be a more effective way to facilitate change: https://westurner.github.io/hnlog/#comment-15471868
/? greatest speeches: https://m.youtube.com/results?sp=mAEA&search_query=Greatest+...
"Lend Me Your Ears: Great Speeches in History" by William Safire. https://a.co/8svyoUw
E.g. "The Prosperity Bible: The Greatest Writings of All Time on the Secrets to Wealth and Prosperity" (Napoleon Hill, PT Barnum, Dale Carnegie, Gibran, Benjamin Franklin; 5000+ pages). https://a.co/b8Ej6o7
Talking points: Peaceful coexistence, #GlobalGoals 1-17 (UN SDGs), "Limits to Growth: The 30-Year Update" by Donella H. Meadows. https://a.co/7MgO0bv
Google sunsets the APK format for new Android apps
I was just trying to explain this the other day. Not sure whether to be disappointed: is this a regression? No, bros, you may not just `repack it` and re-sign the package for me. That's not how it should work unless I trust their build server to sign for me; and I don't, and we shouldn't. I'll just CC this here from https://westurner.github.io/hnlog/#comment-27410978 :
```
> Unfortunately all packages aren't necessarily signed either; "Which package managers require packages to be cryptographically signed?" is similar to "Which DNS clients can operate DNS resolvers that require DNSSEC signatures on DNS records to validate against the distributed trust anchors?".
> FWIW, `delv pkg.mirror.server.org` is how you can check DNSSEC:
man systemd-resolved # nmcli
man delv
man dnssec-trust-anchors.d
delv pkg.mirror.server.org
> Sigstore is a free and open Linux Foundation service for asset signatures: https://sigstore.dev/what_is_sigstore/
> The TUF Overview explains some of the risks of asset signature systems (e.g. key compromise; one key for everything that we all share and can't log the revocation of in a CT (Certificate Transparency) log distributed like a DLT): https://theupdateframework.io/overview/
> Certificate Transparency: https://en.wikipedia.org/wiki/Certificate_Transparency
> Yeah, there's a channel to secure there at that layer of the software supply chain as well.
> "PEP 480 -- Surviving a Compromise of PyPI: End-to-end signing of packages" (2014-) https://www.python.org/dev/peps/pep-0480/
>> Proposed is an extension to PEP 458 that adds support for end-to-end signing and the maximum security model. End-to-end signing allows both PyPI and developers to sign for the distributions that are downloaded by clients. The minimum security model proposed by PEP 458 supports continuous delivery of distributions (because they are signed by online keys), but that model does not protect distributions in the event that PyPI is compromised. In the minimum security model, attackers who have compromised the signing keys stored on PyPI Infrastructure may sign for malicious distributions. The maximum security model, described in this PEP, retains the benefits of PEP 458 (e.g., immediate availability of distributions that are uploaded to PyPI), but additionally ensures that end-users are not at risk of installing forged software if PyPI is compromised.
> One W3C Linked Data way to handle https://schema.org/SoftwareApplication ( https://codemeta.github.io/user-guide/ ) cryptographic signatures of a JSON-LD manifest with per-file and whole package hashes would be with e.g. W3C ld-signatures/ld-proofs and W3C DID (Decentralized Identifiers) or x.509 certs in a CT log.
```
FWIU, the Fuchsia team is building package signing on top of TUF.
A from-scratch tour of Bitcoin in Python
This reminds me of Ken Shirriff's 2014 "Bitcoins the Hard Way" blog post that also used Python to build a Bitcoin transaction from scratch: http://www.righto.com/2014/02/bitcoins-hard-way-using-raw-bi...
(The subtitle of the blog is "Computer history, restoring vintage computers, IC reverse engineering, and whatever" and it is full of fascinating articles, several of which have been featured here on HN)
> The 'dumbcoin' jupyter notebook is also a good reference: "Dumbcoin - An educational python implementation of a bitcoin-like blockchain" https://nbviewer.jupyter.org/github/julienr/ipynb_playground...
https://github.com/yjjnls/awesome-blockchain#implementation-... and https://github.com/openblockchains/awesome-blockchains#pytho... list a few more ~"blockchain from scratch" [in Python] examples.
... FWIU, Ethereum has the better Python story. There was a reference implementation of Ethereum in Python? https://ethereum.org/en/developers/docs/programming-language...
An Omega-3 that’s poison for cancer tumors
If you are interested in adding Omega-3 to your stack, make sure it is molecularly distilled fish oil. Otherwise there is the risk of excess lead and mercury.
You can also eat fish that are low on the food chain like sardines.
Fish don't synthesize Omega PUFAs; they eat algae (which unfortunately and inopportunely stains teeth).
From "Warning: Combination of Omega-3s in Popular Supplements May Blunt Heart Benefits" https://scitechdaily.com/warning-combination-of-omega-3s-in-... :
> Now, new research from the Intermountain Healthcare Heart Institute in Salt Lake City finds that higher EPA blood levels alone lowered the risk of major cardiac events and death in patients, while DHA blunted the cardiovascular benefits of EPA. Higher DHA levels at any level of EPA, worsened health outcomes.
> Results of the Intermountain study, which examined nearly 1,000 patients over a 10-year-period,
> “Based on these and other findings, we can still tell our patients to eat Omega-3 rich foods, but we should not be recommending them in pill form as supplements or even as combined (EPA + DHA) prescription products,” he said. “Our data adds further strength to the findings of the recent REDUCE-IT (2018) study that EPA-only prescription products reduce heart disease events.”
Now they're sayin'; so I go look for an EPA-only supplement, and TIL about re-esterified triglyceride and it says it's molecularly distilled anchovies in blister packages. Which early land mammals probably ate, so.
I found this comment too late... Already ordered a "kids" supplement that has a 2:1 ratio of DHA:EPA.
I do not think that you should be much concerned about this, because other studies show that more DHA than EPA is preferable when using other criteria.
So the conclusion is that nobody knows for sure if you should eat more DHA than EPA, to have a better brain, or less DHA than EPA, to have a better heart.
What is certain is that you need some minimum quantity of both DHA and EPA.
Especially for children, who do not yet have to worry about cardiovascular problems, a supplement with more DHA than EPA actually seems a good choice.
Discover and Prevent Linux Kernel Zero-Day Exploit Using Formal Verification
[Coq, VST, CompCert]
Formal methods: https://en.wikipedia.org/wiki/Formal_methods
Formal specification: https://en.wikipedia.org/wiki/Formal_specification
Implementation of formal specification: https://en.wikipedia.org/wiki/Anti-pattern#Software_engineer...
Formal verification: https://en.wikipedia.org/wiki/Formal_verification
From "Why Don't People Use Formal Methods?" https://news.ycombinator.com/item?id=18965964 :
> Which universities teach formal methods?
> - q=formal+verification https://www.class-central.com/search?q=formal+verification
> - q=formal+methods https://www.class-central.com/search?q=formal+methods
> Is formal verification a required course or curriculum competency for any Computer Science or Software Engineering / Computer Engineering degree programs?
Can there still be side channel attacks in formally verified systems? Can e.g. TLA+ help with that at all?
Formal methods could be used to prevent side-channel attacks. However, preventing each type of attack requires specifying a property and proving that the attack is not possible. I am not very familiar with TLA+, but I think that in general it is not guaranteed to prevent side-channel attacks, such as those based on timing, CPU state, etc.
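To make the side-channel point concrete: both comparison functions below are functionally correct (and could each satisfy an equality spec), but the naive one returns at the first mismatched byte, so its running time leaks how much of a secret a guess got right. A minimal Python illustration using the stdlib's constant-time helper:

```python
import hmac

def naive_equal(a: bytes, b: bytes) -> bool:
    """Functionally correct, but exits at the first mismatch --
    the running time depends on how many leading bytes match."""
    if len(a) != len(b):
        return False
    for x, y in zip(a, b):
        if x != y:
            return False
    return True

def ct_equal(a: bytes, b: bytes) -> bool:
    """Same input/output behavior, but runtime independent of contents."""
    return hmac.compare_digest(a, b)
```

A proof of functional equality alone would accept both; ruling out the timing leak requires specifying a timing property as well.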
Anatomy of a Linux DNS Lookup
This predates systemd's decision to get involved via systemd-resolved. So now it's got another step :)
Edit: Well, and also browsers doing DNS over https. It's pretty confusing these days to know which path your app took to resolve a name.
Which is super annoying when you actually want to run a DNS server and have to try and convince systemd-resolved to stay in its own lane.
Is there a good example of a Linux package that does this correctly?
I suspect the big part is putting "DNSStubListener=no" into the [Resolve] section of /etc/systemd/resolved.conf if you want to have something else listening on port 53.
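Relatedly, systemd-resolved's stub listener answers on 127.0.0.53, and glibc only routes queries through it if /etc/resolv.conf points there. A small hypothetical helper (the function name and the stock stub address are my assumptions) to check which path name resolution takes:

```python
def uses_resolved_stub(resolv_conf: str = "/etc/resolv.conf") -> bool:
    """True if the first nameserver entry is systemd-resolved's local stub."""
    try:
        with open(resolv_conf) as f:
            for line in f:
                parts = line.split()
                if len(parts) >= 2 and parts[0] == "nameserver":
                    return parts[1] == "127.0.0.53"
    except FileNotFoundError:
        pass
    return False
```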
Or you can just mask the service (better than disabling):
systemctl mask systemd-resolved.service
Other services to consider masking if you are not using systemd for network configuration, time sync, etc.:
systemctl mask systemd-hostnamed.service
systemctl mask systemd-timesyncd.service
systemctl mask systemd-networkd.service
systemctl mask systemd-homed.service
Yeah, but if you regress to 'legacy DNS' by removing systemd-resolved then there's no good way to do per-interface DNS (~split DNS per client), or (optionally) validate DNSSEC, or do DoH/DoT; and then nothing respawns substitute network service processes or logs consistently-timestamped process events for them.
FWIU, per-user DNS configs are still elusive. Per-user DNS would make it easier to use family-safe DNS (that redirects to family-safe e.g. SafeSearch domains) by default; some forums are essential for system administration.
Since nothing in the real world is or ever will be DNSSEC signed anyways, it's not that much of a loss. Meanwhile, the most important DoH user --- your browser --- has DoH baked in and doesn't need systemd-resolved's help.
Your system may also depend upon one or more package managers that do all depend upon DNS (and hopefully e.g. DNSSEC and DoH/DoT)
Package managers don't rely on DNSSEC for package security (it would be deeply problematic if they did). But you can also just go look and see that, for instance, Ubuntu.com isn't signed, nor are most of the Pacman mirrors for Arch, nor is alpinelinux.org, &c. DNSSEC is simply not a factor in package management security.
People have weird ideas about how this stuff works.
Unfortunately all packages aren't necessarily signed either; "Which package managers require packages to be cryptographically signed?" is similar to "Which DNS clients can operate DNS resolvers that require DNSSEC signatures on DNS records to validate against the distributed trust anchors?".
FWIW, `delv pkg.mirror.server.org` is how you can check DNSSEC:
man systemd-resolved # nmcli
man delv
man dnssec-trust-anchors.d
delv pkg.mirror.server.org
Sigstore is a free and open Linux Foundation service for asset signatures:
https://sigstore.dev/what_is_sigstore/
The TUF Overview explains some of the risks of asset-signature systems, e.g. key compromise: there's one key for everything, which we all share and whose revocation can't be logged in a CT (Certificate Transparency) log distributed like a DLT. https://theupdateframework.io/overview/
Certificate Transparency: https://en.wikipedia.org/wiki/Certificate_Transparency
Yeah, there's a channel to secure there at that layer of the software supply chain as well.
"PEP 480 -- Surviving a Compromise of PyPI: End-to-end signing of packages" (2014-) https://www.python.org/dev/peps/pep-0480/
> Proposed is an extension to PEP 458 that adds support for end-to-end signing and the maximum security model. End-to-end signing allows both PyPI and developers to sign for the distributions that are downloaded by clients. The minimum security model proposed by PEP 458 supports continuous delivery of distributions (because they are signed by online keys), but that model does not protect distributions in the event that PyPI is compromised. In the minimum security model, attackers who have compromised the signing keys stored on PyPI Infrastructure may sign for malicious distributions. The maximum security model, described in this PEP, retains the benefits of PEP 458 (e.g., immediate availability of distributions that are uploaded to PyPI), but additionally ensures that end-users are not at risk of installing forged software if PyPI is compromised.
One W3C Linked Data way to handle https://schema.org/SoftwareApplication ( https://codemeta.github.io/user-guide/ ) cryptographic signatures of a JSON-LD manifest with per-file and whole package hashes would be with e.g. W3C ld-signatures/ld-proofs and W3C DID (Decentralized Identifiers) or x.509 certs in a CT log.
JupyterLite – WASM-powered Jupyter running in the browser
Although Pyolite's performance is miserable (a ~20 MB download), the overall project direction is correct.
I said this 10 years ago already: we don't need more cloud computing; we need to empower users' end devices again. Jupyter is typically operated on powerful notebooks, not on mobile devices.
Isn't that just Java all over again, but this time with JavaScript?
There are crucial differences between Java applets and JS.
- Applets tried to render their own GUI; Wasm doesn't, and defers to the browser.
- Applets needed a big, slow-to-start, resource-hungry VM. Wasm runs in the same thread your JS runs in; it's light and loads faster than JS.
- Java and Flash were plugins, which needed to be installed and kept up to date separately. Wasm is baked into your browser's JS engine.
- Wasm code is very fast and can achieve near-native execution speeds. It can make use of advanced optimisations; SIMD has shipped in Chrome and will soon ship in Firefox.
- The Wasm spec is very, very good, and really quite small. This means implementing it is comparatively cheap, which should make it easy to see it implemented by different vendors.
- Java was just Java; Wasm can serve as a platform for any language. See my earlier point about the spec.
So it's apples and oranges. The need to have something besides JS hasn't gone away, so their use cases might be similar. The two technologies couldn't be more distinct, though.
You must view the browser with JS and WASM as a unit.
The browser renders its own GUI too; it's not OS-native.
The browser uses lots of resources too.
The browser is kind of a plugin to the OS and must be updated separately.
Java nowadays is pretty fast too.
Java VM serves a platform for multiple languages like Scala, Kotlin, Clojure.
Let's face it, the browser is the new JVM, and as soon as it gets the same permissions as the JVM to access the file system and such, we'll get the same problems.
From https://news.ycombinator.com/item?id=24052393 re: Starboard:
> https://developer.mozilla.org/en-US/docs/Web/Security/Subres... : "Subresource Integrity (SRI) is a security feature that enables browsers to verify that resources they fetch (for example, from a CDN) are delivered without unexpected manipulation. It works by allowing you to provide a cryptographic hash that a fetched resource must match."
> There's a new Native Filesystem API: "The new Native File System API allows web apps to read or save changes directly to files and folders on the user's device." https://web.dev/native-file-system/
> We'll need a way to grant specific URLs specific, limited amounts of storage.
[...]
> https://github.com/deathbeds/jyve/issues/46 :
> Would [Micromamba] and conda-forge build a WASM architecture target?
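The SRI integrity value quoted above is just an algorithm label plus the base64 of a cryptographic digest of the resource; computing one yourself takes a few lines (a sketch; sha384 is the commonly recommended algorithm):

```python
import base64
import hashlib

def sri_hash(data: bytes, alg: str = "sha384") -> str:
    """Compute a Subresource Integrity value, e.g. for <script integrity=...>."""
    digest = hashlib.new(alg, data).digest()
    return f"{alg}-{base64.b64encode(digest).decode('ascii')}"
```

The browser recomputes the same digest over the fetched bytes and refuses to execute the resource on a mismatch.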
Accenture, GitHub, Microsoft and ThoughtWorks Launch the GSF
> With data centers around the world accounting for 1% of global electricity demand, and projections to consume 3-8% in the next decade, it’s imperative we address this as an industry.
> To help in that endeavor, we’re excited to announce the formation of The Green Software Foundation – a nonprofit founded by Accenture, GitHub, Microsoft and ThoughtWorks established with the Linux Foundation and the Joint Development Foundation Projects LLC to build a trusted ecosystem of people, standards, tooling and leading practices for building green software. The Green Software Foundation was born out of a mutual desire and need to collaborate across the software industry. Organizations with a shared commitment to sustainability and an interest in green software development principles are encouraged to join the foundation to help grow the field of green software engineering, contribute to standards for the industry, and work together to reduce the carbon emissions of software. The foundation aims to help the software industry contribute to the information and communications technology sector’s broader targets for reducing greenhouse gas emissions by 45% by 2030, in line with the Paris Climate Agreement.
Here's hoping incentives now lead to hand-optimized, efficient EC, SHA-256, SHA-3, and Scrypt routines. See also the Crypto Climate Accord, which is also inspired by the Paris Agreement: https://cryptoclimate.org/
... "Thermodynamics of Computation Wiki" https://news.ycombinator.com/item?id=18146854
Is 100% offset by PPAs always 200% Green?
From "Ask HN: What jobs can a software engineer take to tackle climate change?" https://news.ycombinator.com/item?id=20015801 :
> [ ] We should create some sort of a badge and structured data (JSONLD, RDFa, Microdata) for site headers and/or footers that lets consumers know that we're working toward '200% green' so that we can vote with our money.
>inspired by the Paris Agreement
So we’re just going to let China do the polluting? Maybe you didn’t know but the Paris Agreement exempted China from all standards.
No, under the Paris Agreement, countries set voluntary targets for themselves and regularly reassess.
And when China says it’s doing nothing everybody cheered.
TBF, the glut of [Chinese,] solar panels has significantly helped lower the cost of renewables; which is in everyone's interest.
They’re cheap because the secret ingredient is slavery and Uighur blood.
Rocky Linux releases its first release candidate
One of our clients pushed us toward AlmaLinux instead of CentOS 7. At the time, we vaguely suggested Rocky Linux, but noted we were kind of still waiting for the future (hence our suggestion of going with CentOS 7 and migrating later to the "true successor" of CentOS).
What is the HN community's perception of this? Is AlmaLinux a good choice? Do you believe Rocky Linux will succeed?
I’d say “neither”. CentOS would have failed had Red Hat not taken over. My guess is that supporting all the companies who do not want to pay Red Hat, for free, is going to push developers away pretty quickly.
Even if I’m wrong, I’d still want to see at least two or three solid releases before using either in production.
Just use Red Hat or switch to Ubuntu. Then you can re-evaluate in five years’ time.
> I’d say “neither”. CentOS would have failed had Red Hat not taken over.
While it's possible that CentOS would have failed (I'm not sure slow initial releases actually indicate that), it also wasn't the only RHEL clone in wide use. Scientific Linux (developed by Fermilab) was also popular. I doubt Red Hat would have made the change to CentOS that spurred all this if Scientific Linux hadn't decided in 2019 to discontinue and just use CentOS instead, because then instead of a "crap, we have to start a new distro to provide what CentOS did" movement there would have been a much quicker and easier mass exodus to Scientific Linux.
The Scientific Linux team decided to not make a Scientific Linux 8, and instead switch over to CentOS 8[0]. Note this was announced prior to the announcement about CentOS 8 going away.
CERN et al are still deciding what to do[1].
[0]: https://listserv.fnal.gov/scripts/wa.exe?A2=SCIENTIFIC-LINUX...
[1]: https://linux.web.cern.ch/#update-on-centos-linux-strategy
USB-C is about to go from 100W to 240W, enough to power beefier laptops
What are the costs to add a USB PD module to an electronic device? https://hackaday.com/2021/04/21/easy-usb-c-power-for-all-you...
- [ ] Create an industry standard interface for charging and using [power tool,] battery packs; and adapters
Half-Double: New hammering technique for DRAM Rowhammer bug
From "Rowhammer for qubits: is it possible?" https://amp.reddit.com/r/quantum/comments/7osud4/rowhammer_f... :
> Sometimes bits just flip due to "cosmic rays"; or, logically, also due to e.g. neutron beams and magnetic fields.
> With rowhammer, there are read/write (?) access patterns which cause predictable-enough information "leakage" to be useful for data exfiltration and privilege escalation.
> With the objective of modeling qubit interactions using quantum-mechanical properties of fields of electrons in e.g. DRAM, Is there a way to use DRAM electron "soft errors" to model quantum interactions; to build a quantum computer from what we currently see as errors in DRAM?
> If not with current DRAM, could one apply a magnetic field to DRAM in order to exploit quantum properties of electrons moving in a magnetic field?
https://en.wikipedia.org/wiki/DRAM
https://en.wikipedia.org/wiki/Row_hammer
https://en.wikipedia.org/wiki/Soft_error
https://en.wikipedia.org/wiki/Crosstalk
> [...] are there DRAM read/write patterns which cause errors due to interference which approximate quantum logic gates? Probably not, but maybe; especially with an applied magnetic field (which then isn't the DRAM sitting on our desks, it's then DRAM + a constant or variable field).
> I suppose to test this longshot theory, one would need to fuzz low-level RAM loads and search for outputs that look like quantum gate outputs. Or, monitor normal workloads which result in RAM faults which approximate quantum logic gate outputs and train a network to recognize the features.
> I am reminded of a recent approach to in-RAM computing that's not memristors.
> Soft errors caused by cosmic rays are obviously more frequent at higher altitudes (and outside of the Van Allen radiation belt).
Thought I'd ask this here as well.
Quantum tunneling was the perceived barrier at like DDR5 and higher densities FWIU? Barring new non-electron-based tech, how can we prevent adjacent electrons from just flipping at that gate grid gap size?
Other Quantum-on-Silicon approaches have coherence issues, too
Setting up a Raspberry Pi with 2 Network Interfaces as a simple router
Old but relevant: I have used an EspressoBin as a router for years, and the performance I am getting is similar to that of much more expensive semi-professional routers. https://blog.tjll.net/building-my-perfect-router/
For those who wish to go to less trouble for similar performance/security, there are a number of Single Board Computers (SBCs) with OpenWrt support[1], sometimes with official support, like those from FriendlyElec[2].
[1] https://openwrt.org/toh/views/toh_single-board-computers
[2] https://www.friendlyarm.com/index.php?route=product/category...
I'm constantly looking for well-priced SBCs with cellular connectivity - that chart shows only 2 or 3 with modems, and all appear to be via external modules.
Anyone know of any?
> This page shows devices which have a LTE modem built in and are supported by OpenWrt.
https://openwrt.org/toh/views/toh_lte_modem_supported
It looks like this table is neither current nor complete though. And there's a different table of OpenWRT compatible devices that have a battery as well.
> [The Amarok (GL-X1200) Industrial IoT Gateway has] 2x SIM card slots for 2x 4G LTE modems (probably miniPCI-E so maybe upgradeable to 5G later), external antenna connectors for the LTE modems, MicroSD, #OpenWRT: https://store.gl-inet.com/collections/4g-smart-router/produc...
The Turris Omnia also has 4G LTE SIM card support (and LXC in their OpenWRT build). https://openwrt.org/toh/turris/turris_omnia
There's also a [Dockerized] x86 build of OpenWRT that probably also supports Mini PCI-E modules for 4G LTE, LoRa, and 5G. Route metrics determine which [gateway] route is tried first.
From "How much total throughput can your wi-fi router really provide?" https://news.ycombinator.com/item?id=26596395 :
> In 2021, most routers - even with OpenWRT and hardware-offloading - cannot actually push 1 Gigabit over wired Ethernet, though the port spec does say 1000 Mbps
What to do about GPU packages on PyPI?
“ Our current CDN “costs” are ~$1.5M/month and not getting smaller. This is generously supported by our CDN provider but is a liability for PyPI’s long-term existence.”
Wow
Bear in mind this isn't just end-users installing on their machines, it also includes continuous integration scripts that run quite frequently.
[Huge GPU] packages can be cached locally: persist ~/.cache/pip between builds with e.g. Docker, run a PyPI caching proxy,
"[Discussions on Python.org] [Packaging] Draft PEP: PyPI cost solutions: CI, mirrors, containers, and caching to scale" https://discuss.python.org/t/draft-pep-pypi-cost-solutions-c...
> Continuous Integration automated build and testing services can help reduce the costs of hosting PyPI by running local mirrors and advising clients in regards to how to efficiently re-build software hundreds or thousands of times a month without re-downloading everything from PyPI every time.
[...]
> Request from and advisory for CI Services and CI Implementors:
> Dear CI Service,
> - Please consider running local package mirrors and enabling use of local package mirrors by default for clients’ CI builds.
> - Please advise clients regarding more efficient containerized software build and test strategies.
> Running local package mirrors will save PyPI (the Python Package Index, a service maintained by PyPA, a group within the non-profit Python Software Foundation) generously donated resources. (At present (March 2020), PyPI costs ~ $800,000 USD a month to operate; even with generously donated resources).
Looks like the current figure is significantly higher than $800K/mo for science.
How to persist ~/.cache/pip between builds with e.g. Docker in order to minimize unnecessary GPU package re-downloads:
RUN --mount=type=cache,target=/root/.cache/pip pip install -r requirements.txt
RUN --mount=type=cache,target=/home/appuser/.cache/pip pip install -r requirements.txt
Wow, that RUN trick is exactly what I've been looking for! I've spent hours and hours in Docker documentation and hadn't seen that functionality.
Looks like it might be buildkit-specific?
Markdown Notes VS Code extension: Navigate notes with [[wiki-links]]
> Syntax highlighting for #tags.
What's the best way to search for #tags with VS Code? Are #tags indexed into an e.g. ctags file within a project or a directory?
> @bibtex-citations: Use pandoc-style citations in your notes (eg @author_title_year) to get syntax highlighting, autocompletion and go to definition, if you setup a global BibTeX file with your references.
Open VS Code's search across files (press Ctrl+Shift+F on Windows) and type the tag that you are searching for: #tag
Simple as that https://code.visualstudio.com/docs/editor/codebasics#_search...
Ask HN: Choosing a language to learn for the heck of it
I'm a technical manager, which means I do a lot of administrative stuff and a little coding. The coding has become a nice distraction when I need to take a break.
For "real work" I write mostly Python, a lot of SQL, a little bit of Go, and some shell scripting to glue it together. I'd like to learn something I have no need of for work. If it becomes useful later, that is OK, but not a goal. The goal is in creating something just for fun. That something is undefined, so general purpose languages are the population.
I have become curious lately about Nim, Crystal, and Zig: small, modern, high-performance languages. Curiosity comes from the cases when they are mentioned here, sometimes for reasons similar to those I list above.
Nim is on top of the list: Sort of Python like, supported on Windows (I use Win/Mac/Linux), appears to have libraries for the things I do: Process text for insights, play projects would use interesting data instead of business data.
Crystal does not support Windows (yet), but appears to be closer to Ruby. Its performance may be a bit better.
Zig came on my radar recently, I know less about it, compared to the little I know of the others.
Suggestions on choosing one as a hobby language?
> Suggestions on choosing one as a hobby language?
IDK how much of a hobby it'd remain, but: Rust compiles to WASM, C++ now has auto and coroutines (and real live memory management)
"Ask HN: Is it worth it to learn C in 2020?" https://news.ycombinator.com/item?id=21878664
Show HN: Django SQL Dashboard
This looks great! In a similar vein - does anyone know of a project that will allow for a Django shell in the browser?
I know Jupyter exists - but a solution like this with the permissions would be valuable.
This launches the web-based Werkzeug debugger on Exception:
pip install django-extensions
python manage.py runserver_plus
https://django-extensions.readthedocs.io/en/latest/runserver...
This should run IPython Notebook with the database models already imported:
python manage.py shell_plus --notebook
But writing fixtures, tests, and (Celery / dask-labextension) tasks is probably the better way to do things. django-rest-assured is one way to get a tested REST API with DRF and e.g. factory_boy for generating test data.
Interactive IPA Chart
Is there a [Linked Data] resource with the information in this interactive IPA chart (which is from Wikipedia, FWICS), in addition to:
- phoneme, ns:"US English letter combinations", []
- phoneme, ns:"schema.org/CreativeWorks which feature said phoneme", []
AFAIU, WordNet RDF doesn't have links to any IPA RDFS/OWL vocabulary/ontology yet.
Google Dataset Search
Information on how to annotate datasets: https://developers.google.com/search/docs/data-types/dataset
> We can understand structured data in Web pages about datasets, using either schema.org Dataset markup, or equivalent structures represented in W3C's Data Catalog Vocabulary (DCAT) format. We also are exploring experimental support for structured data based on W3C CSVW, and expect to evolve and adapt our approach as best practices for dataset description emerge. For more information about our approach to dataset discovery, see Making it easier to discover datasets.
For more info on those:
- W3C's Data Catalog Vocabulary: https://www.w3.org/TR/vocab-dcat-3/
- Schema.org dataset: https://schema.org/Dataset
- CSVW Namespace Vocabulary Terms: https://www.w3.org/ns/csvw
- Generating RDF from Tabular Data on the Web (examples on how to use CSVW): https://www.w3.org/TR/csv2rdf/
Use cases for such [LD: Linked Data] metadata:
1. #StructuredPremises:
> (How do I indicate that this is a https://schema.org/ScholarlyArticle predicated upon premises including this Dataset and these logical propositions?)
2. #LinkedMetaAnalyses; #LinkedResearch "#StudyGraph"
3. [CSVW (Tabular Data Model),] schema.org/Dataset(s) with per column (per-feature) physical quantity and unit URIs with e.g. QUDT and/or https://schema.org/StructuredValue metadata for maximum data reusability.
4. JupyterLab notebooks:
4a. JupyterLab Metadata Service extension: https://github.com/jupyterlab/jupyterlab-metadata-service :
> - displays linked data about the resources you are interacting with in JupyterLab.
> - enables other extensions to register as linked data providers to expose JSON LD about an entity given the entity's URL.
> - exposes linked data to the user as a Linked Data viewer in the Data Browser pane.
4b. JupyterLab Data Explorer: https://github.com/jupyterlab/jupyterlab-data-explorer :
> - Data changing on you? Use RxJS observables to represent data over time.
> - Have a new way to look at your data? Create React or lumino components to view a certain type.
> - Built-in data explorer UI to find and use available datasets.
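To make the schema.org/Dataset markup from the Google documentation above concrete, here is a minimal description (every name and URL below is a placeholder) that could be embedded in a page as `<script type="application/ld+json">`:

```python
import json

# A minimal, hypothetical schema.org/Dataset description.
dataset = {
    "@context": "https://schema.org/",
    "@type": "Dataset",
    "name": "Hourly temperature readings (example)",
    "description": "Hourly readings from a hypothetical weather sensor.",
    "license": "https://creativecommons.org/publicdomain/zero/1.0/",
    "distribution": [
        {
            "@type": "DataDownload",
            "encodingFormat": "text/csv",
            "contentUrl": "https://example.org/data/temps.csv",
        }
    ],
}
markup = json.dumps(dataset, indent=2)
```

The same structure can be extended with per-column metadata (e.g. CSVW or schema.org/StructuredValue) for the unit-annotation use case above.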
Ask HN: Cap Table Service Recommendations
Recent founders, do you have any recommendations for services for managing a cap table? Or do you do it yourself? Any suggestions for how to choose?
Here are the "409a valuation" reviews on FounderKit: https://founderkit.com/legal/409a-valuation/reviews
Hosting SQLite databases on GitHub Pages or any static file hoster
The innovation here is getting sql.js to use HTTP range requests for file access rather than having everything in memory.
I wonder when people using next.js will start using this for faster builds for larger static sites?
See also https://github.com/bittorrent/sqltorrent, same trick but using BitTorrent
Yeah, that was one of the inspirations for this. That one does not work in the browser though, would be a good project to do that same thing but with sqlite in wasm and integrated with WebTorrent instead of a native torrent program.
I actually did also implement a similar thing fetching data on demand from WebTorrent (and in turn helping to host the data yourself by being on the website): https://phiresky.github.io/tv-show-ratings/ That uses a protobufs split into a hashmap instead of SQLite though.
This looks pretty efficient. Some chains can be interacted with without e.g. web3.js? LevelDB indexes aren't SQLite.
Datasette is one application for views of read-only SQLite dbs with out-of-band replication. https://github.com/simonw/datasette
There are a bunch of *-to-sqlite utilities in the corresponding Dogsheep project.
Arrow JS for 'paged' browser client access to DuckDB might be possible and faster but without full SQLite SQL compatibility and the SQLite test suite. https://arrow.apache.org/docs/js/
> Direct Parquet & CSV querying
In-browser notebooks like Pyodide and Jyve have local filesystem access with the new File System Access API, but downloading/copying all data to the browser for every run of a browser-hosted notebook may not be necessary. https://web.dev/file-system-access/
DuckDB can directly & selectively query Parquet files over HTTP/S3 as well. See here for examples: https://github.com/duckdb/duckdb/blob/6c7c9805fdf1604039ebed...
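The range-request trick upthread works because SQLite's file format is page-addressed: the first 100 bytes are a fixed header whose bytes 16-17 hold the page size (big-endian, with the special value 1 meaning 65536), after which any page can be fetched independently. A sketch of that first "range read", done here against a local file rather than over HTTP:

```python
import os
import sqlite3
import struct
import tempfile

def page_size_from_header(header: bytes) -> int:
    """Parse the page size from the 100-byte SQLite database header."""
    assert header[:16] == b"SQLite format 3\x00", "not a SQLite database"
    (raw,) = struct.unpack(">H", header[16:18])  # big-endian u16 at offset 16
    return 65536 if raw == 1 else raw

# Build a small database, then read only its header -- the same first
# request a Range-based HTTP VFS would issue before fetching pages.
path = os.path.join(tempfile.mkdtemp(), "demo.db")
con = sqlite3.connect(path)
con.execute("CREATE TABLE t (x)")
con.commit()
con.close()

with open(path, "rb") as f:
    header = f.read(100)
page_size = page_size_from_header(header)
```

Once the page size is known, page N lives at byte offset `(N - 1) * page_size`, which maps directly onto an HTTP `Range: bytes=...` request.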
Wasm3 compiles itself (using LLVM/Clang compiled to WASM)
Self-hosting (compilers) https://en.wikipedia.org/wiki/Self-hosting_(compilers) :
> In computer programming, self-hosting is the use of a program as part of the toolchain or operating system that produces new versions of that same program—for example, a compiler that can compile its own source code
How big of a milestone is self-hosting for a language?
Semgrep: Semantic grep for code
Is there a more complete example of how to call semgrep from pre-commit (which gets called before every git commit) in order to prevent e.g. Python print calls (print(), or split across lines as print\n(), etc.) from being checked in?
https://semgrep.dev/docs/extensions/ describes how to do pre-commit.
Nvm, here's semgrep's own .pre-commit-config.yml for semgrep itself: https://github.com/returntocorp/semgrep/blob/develop/.pre-co...
I've never used the `pre-commit` framework, but it's really simple to wire up arbitrary shell scripts; check out the `.git/hooks` directory in your repo for samples, e.g. `.git/hooks/pre-commit.sample`.
You can run any old shell script there, without having to install a python tool.
Yeah, but that git hook will only be installed in that one repo on that one machine. And they may have no bash installed, or a different version (on e.g. macOS or Windows). IMHO, POSIX-compatible portable shell scripts are more trouble than portable Python scripts.
Pre-commit requires Python and pre-commit to be installed (and then it downloads every hook function).
This fetches the latest version of every hook defined in the .pre-commit-config.yml:
pre-commit autoupdate
https://pre-commit.com/#pre-commit-autoupdate
A person could easily `ln -s repo/.hooks/hook*.sh repo/.git/hooks/` after every git clone.
Out of curiosity, is there value in doing this over (say) running a GitHub Action post-commit and failing the build if it finds something nasty?
If you can catch it before the commit is even made then why do/wait for a build?
Fair enough. Guess IDE plugins work even better for that
IDE plugins are not at all consistent from one IDE to another. Pre-commit is great for teams with different IDEs because all everyone needs to do is:
[pip,] install pre-commit
pre-commit install
# git commit
# pre-commit run --all-files
# pre-commit autoupdate
https://pre-commit.com/
Ask HN: What to use instead of Bash / Sh for scripting?
I'm at the point where I feel a certain fatigue writing Bash scripts, but I am just not sure of what the alternative is for medium sized (say, ~150-500 LOC) scripts.
The common refrain of "use Python" hasn't really worked fantastically: I don't know what version of Python I'm going to have on the system, installing dependencies is not fun, shelling out when needed is not pleasant, and the size of program always seemingly doubles.
I'm willing to accept something that's not on the system as long as it's one smallish binary that's available in multiple architectures. Right now, I've settled on (ab)using jq, using it whenever tasks get too complex, but I'm wondering if anyone else has found a better way that should also hopefully not be completely a black box to my colleagues?
A configuration management system may have you write e.g. YAML with Jinja2 so that you don't reinvent the idempotent wheel.
It's really easy to write dangerous shell scripts ("${@}" vs ${@} for example) and also easy to write dangerous Python scripts (cmd="{}; {}").
Sarge is one way to use subprocess in Python. https://sarge.readthedocs.io/en/latest/
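A concrete Python illustration of the `cmd="{}; {}"` hazard mentioned above: pass subprocess an argv list instead of formatting a shell string. A minimal sketch, with made-up helper names (sarge, linked above, is one higher-level alternative):

```python
import shlex
import subprocess

# Dangerous pattern: formatting untrusted input into a shell string.
#   cmd = "wc -l {}".format(untrusted)   # "x; rm -rf ~" would execute
#   subprocess.run(cmd, shell=True)

def count_lines(path):
    """Safe pattern: an argv list with shell=False, so shell
    metacharacters in `path` are passed through as literal bytes."""
    result = subprocess.run(
        ["wc", "-l", path],
        capture_output=True, text=True, check=True,
    )
    return int(result.stdout.split()[0])

def render(argv):
    """If a command line must be shown to a human, quote it explicitly."""
    return " ".join(shlex.quote(arg) for arg in argv)
```

The same discipline (never interpolate untrusted data into a command string) is what `"${@}"` vs `${@}` is about on the shell side.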
If you're doing installation and configuration, the most team-maintainable thing is to avoid custom code and work with a configuration management system test runner.
When you think "A shell script will be fine, all I have to do is [...]" and then realize that you need a portable POSIX shell script, that to be merged it must have actual automated tests of things that are supposed to run as root (now in a fresh VM/container for testing), and that manual verification of `set +xev` output isn't an automated assertion.
> avoid custom code and work with a configuration management system test runner
ansible-molecule is a test runner for Ansible playbooks that can create VMs or containers on local or remote resources.
You can definitely just call shell scripts from Ansible, but the (parallel) script output is only logged after the script returns a return code, unless you pipe the script output somewhere and tail that.
> manual verification of `set +xev` output isn't an automated assertion.
From "Bash Error Handling" https://news.ycombinator.com/item?id=24745833 : you can display the line number in `set -x` output by setting $PS4:
export PS4='+(${BASH_SOURCE}:${LINENO}) '
set -x
But that's no substitute for automated tests and a test runner that produces e.g. TAP output from test runner results: http://testanything.org/producers.html#shell
Estonian Electronic Identity Card and Its Security Challenges [pdf]
Proud e-resident here :)
US citizen here trying to champion such a system in the US. Would you be willing to share pros and cons from your experience using the system day to day?
Are you working with a specific org or initiative? I'm also an American interested in this (and an e-resident who formerly worked for the Estonian government).
Mostly activist citizen efforts from the outside, as legislation is going to be required to appropriate funding and direction from Congress, and the legislators I interface with are busy with arguably more pressing work (unfortunate but entirely understandable, such are the times).
I intend to apply at the USDS for the Login.gov team in some capacity to help on the tech side if the necessary legislation can be put in place to support such an initiative. Their system already supports the DOD CAC (common access card), which is a short walk away from a citizen digital ID card (would be a different org and PKI root to administer and govern citizen cards, to grossly simplify).
Login.gov recently expanded to support city and local gov IAM needs (when they have ties to federal programs) [1] [2], so there is roadmap momentum and executive branch will. "Digital identity is a big deal. [3]" They're already serving 30 million users, and 500k DAUs, really just a matter of scaling up.
[1] https://www.gsa.gov/blog/2021/02/18/logingov-to-provide-auth...
[2] https://www.govloop.com/login-gov-expands-use-to-cities-stat...
I get the spirit and am familiar with the CAC and its general benefits, and using that for all .gov stuff is attractive (taxes, bills, FAFSA, etc.)
I think you’re glossing over two major implementation factors that need to be part of the discussion from get-go:
* how much CAC use depends on fairly specific government tech infrastructure being widely deployed, and how that will work for everyone (ever tried to set up a CAC on a civilian computer?)
* key control, either extremely decentralized like the iPhone (lock yourself out of your passport?), or extremely centralized and the newest honeypot OPM holds (root cert for all e-citizen PKI). US currently doesn’t have the internal cybersec chops to run that at all (closest equivalent is CISA).
You're right to point this out, but my counterargument is that these are solvable pain points for such an implementation (either done today in competent zero trust security architectures in progressive orgs or at nation state levels such as Estonia). You won't need a CAC reader on computers, for the most part, if you have mobile apps that can perform the identity proofing (as is already done with examples like Apple's biometrics systems, FIDO2/WebAuthn, Apple Pay, etc). You'd still have the card for interfacing at endpoints (banks, postal service, IRS/SSA offices, other trust anchors and government services endpoints). I already log in to my US CBP Global Entry account with Login.gov and 2FA, so why can I not today use the same IAM system to log in to my Social Security account? Or my IRS tax account? Or to attest to my citizenship or other attributes that I'd normally need a certified document for (yuk!).
I'm not arguing for such a system without robust support and reasonable downgrades for failure scenarios (identity reproofing if you lose your digital ID, for example). I'm arguing for, admittedly challenging, digital ID modernization without disenfranchisement. I genuinely appreciate you pointing out the challenges, as they must be addressed.
US DHS CISA is absolutely a resource that needs to be leaned on heavily to implement what I describe, and to ensure a strong security posture throughout the federal government's infrastructure.
FWIU, DHS has funded [1] development of e.g. W3C DID Decentralized Identifiers [2] and W3C Verifiable Credentials [3]:
[1] https://www.google.com/search?q=site%3Aw3.org+%22funded+by+t...
[2] https://www.w3.org/TR/did-core/
[3] https://www.w3.org/TR/vc-data-model/
Additional notes regarding credentials (certificates, badges, degrees, honorarial degrees, then-evaluated competencies) and capabilities models: https://news.ycombinator.com/item?id=19813340
westurner/blockchain-credential-resources.md: https://gist.github.com/westurner/4345987bb29fca700f52163c33...
Value storage and transmission networks have developed standards and implementations for identity, authentication, and authorization. ILP (Interledger Protocol) RFC 15 specifies "ILP addresses" for [crypto] ledger account IDs: https://interledger.org/rfcs/0015-ilp-addresses/
From "Verifiable Credentials Use Cases" https://w3c.github.io/vc-use-cases/ :
> A verifiable claim is a qualification, achievement, quality, or piece of information about an entity's background such as a name, government ID, payment provider, home address, or university degree. Such a claim describes a quality or qualities, property or properties of an entity which establish its existence and uniqueness. The use cases outlined here are provided in order to make progress toward possible future standardization and interoperability of both low- and high-stakes claims with the goals of storing, transmitting, and receiving digitally verifiable proof of attributes such as qualifications and achievements. The use cases in this document focus on concrete scenarios that the technology defined by the group should address.
FWIU, the US Department of Education is studying or already working with https://blockcerts.org/ for educational credentials.
Here are the open sources of blockchain-certificates/cert-issuer and blockchain-certificates/cert-verifier-js: https://github.com/blockchain-certificates
Might a natural-born resident get a government ID card for passing a recycling and environmental sustainability quiz?
Systemd makes life miserable, again, this time by breaking DNS
So, I made the mistake of updating my laptop from Fedora 31 to Fedora 33 last night. Normally this is fairly painless, as my laptop is one of the last machines I perform distribution upgrades on. Today while doing some pole survey work out in the field, I tethered my laptop to my phone as has been done hundreds of times before. To my surprise, DNS doesn't work anymore, but only in web browsers. Both Firefox and Chrome can't resolve names anymore. Command line tools like ping and host work normally. WTF?
Why are distributions continuing to allow systemd to extend its tentacles deeper and deeper into more parts of Linux userland with poorly tested subsystem replacements for parts of Linux that have been stable for decades? Does nobody else consider this repeating pattern of rewrite-replace-introduce-new-bugs a problem? Newer is not all that better if you break what is a pretty bog standard and common use-case.
As well, Firefox now defaults to DoH (DNS over HTTPS), which may be bypassing systemd-resolved by doing DNS resolution in the app instead of calling `gethostbyname()` (`man gethostbyname`) and/or `getaddrinfo()`.
`man systemd-resolved` describes why there is new DNS functionality: security; "caching and validating DNS/DNSSEC stub resolver, as well as an LLMNR and MulticastDNS resolver and responder".
From `man systemd-resolved` https://man7.org/linux/man-pages/man8/systemd-resolved.servi... :
> To improve compatibility, /etc/resolv.conf is read in order to discover configured system DNS servers, but only if it is not a symlink to /run/systemd/resolve/stub-resolv.conf, /usr/lib/systemd/resolv.conf or /run/systemd/resolve/resolv.conf
> [...] Note that the selected mode of operation for this file is detected fully automatically, depending on whether /etc/resolv.conf is a symlink to /run/systemd/resolve/resolv.conf or lists 127.0.0.53 as DNS server.
Is /etc/resolv.conf read on reload and/or restart of the systemd-resolved service (`systemctl restart systemd-resolved`)?
Some examples of validating DNSSEC in `man delv` would be helpful.
NetworkManager (now with systemd-resolved) is one system for doing DNS configuration for zero or more transient interfaces:
man nmcli
nmcli connection help
nmcli c help
nmcli c h
nmcli c show ssid_or_nm_profile | grep -i dns
nmcli c modify help
man systemd-resolved
man delv
man dnssec-trust-anchors.d
It's probably dnssec & your time is probably off.
journalctl -xe should be showing errors if so. probably check 'journalctl -xeu systemd-resolved'
use 'date' to check the time. if it's off, I use ntpdate to update my clock. but since dns isn't resolving I use 'dig pool.ntp.org @9.9.9.9' to resolve a ntp pool server ip address. then ntpdate [that ip].
one of those things that hits me a bunch that I haven't automated quite yet. supposedly newer systemd can detect dnssec being bad in some cases & disable it, after a time, after a bunch of failures, but it either hasn't tripped in a number of cases for me or something else was odd in my envs. manually syncing the clock via ntp usually gets my dns working again.
> manually syncing the clock via ntp usually gets my dns working again.
Why is this necessary?
normally systemd-timesyncd does a good job keeping things in sync!
but some of my systems seem to have not great real-time clock batteries, and the system will forget the time. i think there might be some other circumstance that sometimes causes my system clock to be way out of whack, but i'm not sure what.
so that's why my clock doesn't work and needs sync.
the dnssec internet protocols are designed to guarantee the user that they have up-to-date, accurate, trustable records. this depends on your system knowing what time it is now. if your system is way ahead of or way behind the actual time, the dnssec records it gets don't appear valid. And systemd-resolved will reject them, if it is set up to respect DNSSEC.
Systemd-timedated: https://www.freedesktop.org/software/systemd/man/systemd-tim...
timedatectl set-ntp false && \
timedatectl set-ntp true
Src: https://github.com/systemd/systemd/blob/main/src/timedate/ti...
Ask HN: How bad is proof-of-work blockchain energy consumption?
I'm not a blockchain/crypto expert by any means, but I've been hearing about how much energy the proof-of-work blockchains (Bitcoin, Ethereum, NFTs) consume. Unless I'm mistaken their whole design relies on cranking through more and more CPU cycles. Should we be more concerned about this? Are the concerns overblown? Are there ways to improve it without certain crypto currencies imploding?
A rational market would be choosing an asset that offers value storage and transmission (between points in spacetime) according to criteria: "security" (security theater, infosec, cryptologic competency assessment, software assurances), "future stability" (future switching costs), and "cost".
The externalities of energy production are what must be overcome if we are to be able to withstand wasteful overconsumption of electricity. Eventually, we could all have free clean energy and no lightsabers, right?
So, we do need to minimize wasteful overconsumption. Define wasteful in terms of USD/kWh (regardless of industry)? In terms of behavioral economics, why are they behaving that way when there are alternatives that cost <$0.01/tx and a fairly-aggregated comprehensive n kWh of electricity?
TIL about these guys, who are deciding to somewhat-responsibly self-regulate in the interest of long-term environmental sustainability for all of the land: "Crypto Climate Accord". https://cryptoclimate.org/
"Crypto Climate Accord Launches to Decarbonize Cryptocurrency Industry Brings together the likes of CoinShares, ConsenSys, Ripple, and the UNFCCC Climate Champions to lead sustainability in blockchain and crypto" (2021) https://bit.ly/CryptoClimateAccord
> What are the objectives of the Crypto Climate Accord? The Accord’s overall objective is to decarbonize the global crypto industry. There are three provisional objectives to be finalized in partnership with Accord supporters:
> - Enable all of the world’s blockchains to be powered by 100% renewables by the 2025 UNFCCC COP Conference
> - Develop an open-source accounting standard for measuring emissions from the cryptocurrency industry
> - Achieve net-zero emissions for the entire crypto industry, including all business operations beyond blockchains and retroactive emissions, by 2040
Similar to the Paris Agreement (2015), stakeholders appear to be setting their own targets for sustainability in accordance with the Crypto Climate Accord (2021). https://cryptoclimate.org/accord/
Someone who's not in renewables could launch e.g. a "Satoshi Nakamoto Clean Energy Fund: SNCEF" to receive donations from e.g. hash pools and connect nonprofits with sustainability managed renewables. How many SNCEFs did you give this year and why?
#CleanEnergy
Thankfully, cheap energy fit for Bitcoin mining is increasingly not found in fossil fuels, but in renewables, wasted and “stranded” sources of energy, as we’ll dive into next. [1]
tl;dr: The perfect competition of bitcoin mining is/will drive bitcoin to eventually only work off the least expensive energy (which currently is renewable [2]) and even more so on marginally wasted energy [3] or "stranded energy" (like a natural gas flare on a well).
[1]: https://bitcoinmagazine.com/business/bitcoin-will-save-our-e...
[2]: https://www.carbonbrief.org/solar-is-now-cheapest-electricit...
[3]: For example, to illustrate the point, I currently mine BTG on a spare GPU I had because my house needed to be heated anyways. I turn the miner on when I want heat in my house (about 600w) and turn it off when not needed... This is a micro example of what could be done across our entire society "Recent data suggest that space heating accounts for about 42 percent of energy use in U.S. residences and about 36 percent of energy use in U.S. commercial buildings."[4] . Every one of those space heaters could actually be a processor doing computation. In the future the "waster" will not be a miner but someone who turns electricity straight to heat without intermediate chains of value (such as computation or light).
[4]: https://www.epa.gov/rhc/renewable-space-heating#:~:text=Rece....
It's important to note for #3 that a mining GPU is a terribly inefficient space heater.
No, all space heaters are equally efficient. They all have perfect 100% efficiency, because they turn electrical power into heat. When your work product is heat and the waste product is also heat, then there really is no waste.
Technically in the case of cryptocurrency mining, some of the electrical power is turned into information rather than heat. In principle this reduces the amount of heat that you get, but in practice this isn’t even measurable. Most of the information is erased (discarded as useless), which turns it back into heat. Only a few hundred bits of information will be kept after successfully mining a block of transactions, and the amount of heat that costs you is fantastically small. Far smaller than you can measure.
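The "fantastically small" claim can be put in numbers with Landauer's bound (k·T·ln 2 joules of heat per erased bit, so also the heat "lost" per bit you keep instead of erasing). A back-of-the-envelope sketch; the 300 K temperature and the "few hundred bits" figure are assumptions taken from the comment:

```python
import math

K_B = 1.380649e-23   # Boltzmann constant, J/K (exact, 2019 SI)
T = 300.0            # room temperature in kelvin (assumption)

def landauer_joules(bits, temperature=T):
    """Minimum heat, in joules, to erase `bits` bits at `temperature`."""
    return bits * K_B * temperature * math.log(2)

# Keeping ~a few hundred bits per mined block "costs" you at most the
# Landauer energy of those bits not being dissipated as heat:
retained = landauer_joules(300)     # ~8.6e-19 J
kettle = 2000.0 * 120               # a 2 kW kettle running 2 minutes, J
print(retained / kettle)            # ~3.6e-24 of one cup of tea
```

So to the precision of any thermometer, a mining rig really is a 100%-efficient resistive heater.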
More transistors per unit area, but also more efficient please! There should be demand for more efficient chips (semiconductors,) that are fully-utilized while depreciating on your ma's electricity bill (which is not yet (?) really determined by a market-based economy with intraday speculation to smooth over differences in supply and demand in the US). Oversupply of the electrical grid results in damage costs; which is why the price sometimes falls so low where there are intraday prices and supply has been over-subsidized pending the additional load from developing economies and EVs: Electric Vehicles.
New grid renewables (#CleanEnergy) are now less expensive than existing baseload; which makes renewables long term environment-rational and short term price-rational.
"Thermodynamics of Computation Wiki" (2018) https://news.ycombinator.com/item?id=18146854
> No, all space heaters are equally efficient. They all have perfect 100% efficiency, because they turn electrical power into heat. When your work product is heat and the waste product is also heat, then there really is no waste.
This heat must be distributed throughout the room somehow (e.g. by a batteryless woodstove fan, or a Stirling engine that does work off the temperature difference when there is one)
> Technically in the case of cryptocurrency mining, some of the electrical power is turned into information rather than heat. In principle this reduces the amount of heat that you get, but in practice this isn’t even measurable. Most of the information is erased (discarded as useless), which turns it back into heat.
See "Thermodynamics of Computation Wiki" re: a possible way to delete known observer-entangled bits while reducing heat/entropy (thus bypassing Landauer's limit for classical computation?)?
> Only a few hundred bits of information will be kept after successfully mining a block of transactions, and the amount of heat that costs you is fantastically small. Far smaller than you can measure.
Each n-symbol sequence in the hash function output does appear to have nearly equal frequency/probability of occurrence. Indeed, is Proof-of-Work worth the heat if you're not reusing the waste heat?
What does a PGP signature on a Git commit prove?
Totally thought this was going to be some snarky article claiming they're worthless. Refreshing to see the article just earnestly answer the headline.
> To my knowledge, there are no effective attacks against sha1 as used by git
Perhaps I'm missing something, but wouldn't a chosen-prefix collision be relevant here? I imagine the real reason is that the cost to pull it off for sha1 is somewhere in the $10,000-$100,000 range (but getting cheaper every year), which is lots of $$$ to attack something without an obvious attack scenario that can justify it.
So the big problem with Git and SHA1 is that in many cases you are giving full control to an untrusted third party over the sha1 hash of something. For example, if you merge a binary file, it'd be quite easy for them to generate two different versions of the same file, with the same SHA1 digest, and then use the second version to cause problems in the future. You may also be able to modify a text file in the same way without getting noticed in review (I'm not up to speed on how advanced sha1 collision techniques are now).
Similarly, the git commits you merge themselves could have that done - the actual git commit serialization gives you a fair bit of ability to append stuff to it that isn't shown in the UI. That wouldn't affect the signed git commits. But it's still dubious to have the ability to change old history in a checkout.
Anyway, Git is apparently moving towards SHA256 support, so hopefully this problem will be fixed soon: https://lwn.net/Articles/823352/
> it'd be quite easy for them to generate two different versions of the same file
Citation needed. When SHA1 was cracked, it cost $110k worth of cloud computing. And there was some restriction on the two files which matched checksums. IIRC it was like the Birthday Paradox — you don’t pick one and find another sharing the same match, but you generate billions of mutations of similar binaries and statistically two would have the same checksum.
Not exactly easy, fast, or cheap, and it doesn't work with all use cases.
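For the birthday-paradox intuition: against an n-bit hash, a generic collision costs roughly 2^(n/2) evaluations, while a (second) preimage costs ~2^n. The practical SHA-1 breaks beat the generic bound only by exploiting structural weaknesses. A quick sketch of the generic numbers:

```python
import math

def birthday_attempts(bits, probability=0.5):
    """Approximate number of random hashes to draw before any two of
    them collide with the given probability (generic birthday bound)."""
    space = 2.0 ** bits
    return math.sqrt(2.0 * space * math.log(1.0 / (1.0 - probability)))

# Generic (weakness-free) attack costs against SHA-1's 160-bit output:
collision_work = birthday_attempts(160)   # ~2^80 hash evaluations
preimage_work = 2.0 ** 160                # ~2^160 hash evaluations

# The 2017 "SHAttered" collision reportedly took ~2^63 SHA-1 calls,
# far below the generic 2^80; that gap is the structural weakness.
print(math.log2(collision_work))          # ≈ 80.2
```

Note this bound says nothing about preimages, which is why a collision break alone doesn't let you forge a file matching a fixed, existing digest.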
That nonce value could be ±\0 or 5,621,964,321e100; though for well-designed cryptographic hash functions it's far less likely that - at maximum difficulty - a low nonce value will result in a hash collision.
? How are nonces even involved in this?
Searching for the value to prepend or append that causes a hash collision is exactly the same as finding a nonce value at maximum difficulty (not less than the difficulty value, exactly equal to the target hash).
Mutate and check.
The term nonce has a broader meaning than how it is used in bitcoin. (Edit: i reworded this sentence from what i originally had)
That said, no, finding a collision and finding a preimage are very different things, and while the collision attacks on sha1 involve a lot of guessing and checking, they are not generic birthday or brute-force attacks but rely on weaknesses in sha-1 to be practical. They also do not make preimage attacks practical.
Brute forcing to find `hash(data_1+nonce) == hash(data_0)` differs very little from `hash(data_1+nonce) < difficulty_level`. Write each and compare the cost/fitness/survival functions.
If the hash function is reversible - as may be discovered through e.g. mutation and selection - that would help find hashes that are equal and maybe also less than.
Practically, there are "rainbow tables" for very many combinations of primes and stacked transforms: it's not necessary to search the whole space for simple collisions and may not be necessary for preimages; we don't know and it's just a matter of time. "Collision attack" https://en.wikipedia.org/wiki/Collision_attack
Crytographic nonce > hashing: https://en.wikipedia.org/wiki/Cryptographic_nonce#Hashing
> Brute forcing
The attack being discussed is not a brute force attack (or not purely). If the best attack on sha1 was bruteforce than we would still be using it.
> to find `hash(data_1+nonce) == hash(data_0)` differs very little from `hash(data_1+nonce) < difficulty_level`.
Neither of those are collision attacks (assuming you don't control the data variable). The first is a second preimage and the second (with equality) would be a normal preimage.
The attack for sha1 under discussion (chosen prefix collision) is finding hash(a+b) == hash(c+d) where you control b and d (but not necessarily a and c)
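The collision/preimage gap is easy to see empirically with a toy hash. The sketch below truncates SHA-256 to 16 bits (a stand-in for illustration, not SHA-1): a birthday-style collision falls out after a few hundred tries, while a preimage for a fixed target needs on the order of 2^16 tries:

```python
import hashlib

def toy_hash(data: bytes) -> bytes:
    """SHA-256 truncated to 16 bits, small enough to attack instantly."""
    return hashlib.sha256(data).digest()[:2]

def find_collision():
    """Birthday search: ANY two inputs sharing a toy hash."""
    seen = {}
    for i in range(1 << 17):
        msg = b"msg-%d" % i
        h = toy_hash(msg)
        if h in seen:
            return seen[h], msg, i   # i ~ a few hundred in expectation
        seen[h] = msg
    raise RuntimeError("no collision found")

def find_preimage(target: bytes):
    """Preimage search: an input hashing to one FIXED value."""
    for i in range(1 << 20):
        msg = b"pre-%d" % i
        if toy_hash(msg) == target:
            return msg, i            # i ~ 2**16 in expectation
    raise RuntimeError("no preimage found")

a, b, collision_tries = find_collision()
pre_msg, preimage_tries = find_preimage(toy_hash(b"fixed"))
```

Scaling those expectations up to 160 bits gives the 2^80 vs 2^160 generic costs; the published SHA-1 attacks only improve the collision side.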
> Practically, there are "rainbow tables" for very many combinations of primes and stacked transforms:
What do primes or rainbow tables have to do with any of this? Primes especially. Rainbow tables are at least related to reversing hashes, if totally irrelevant to the subject at hand, but how did you get to primes?
Practically, iff browsers still relied upon SHA-1 to fingerprint and pin and verify certificates instead of the actual chain, and there were no file size limits on x.509 certificates, some fields in a cert (e.g. CommonName and SAN) would be chosen and other fields would then potentially be nonce.
In the context of finding a valid cert with a known-good hash fingerprint, how many prime keypairs could there be to precompute and cache/memoize when brute forcing?
"SHA-1 > Cryptanalysis and validation" does list chosen-prefix collision as one of many weaknesses now identified in SHA-1: https://en.wikipedia.org/wiki/SHA-1#Cryptanalysis_and_valida...
This is from 2008, re: the 200 PS3s it took to generate a rogue CA cert with a considered-valid MD5 hash: https://hackaday.com/2008/12/30/25c3-hackers-completely-brea...
... Was just discussing e.g. frankencerts the other day: https://news.ycombinator.com/item?id=26605647
Breakthrough for ‘massless’ energy storage
"Researchers from Chalmers University of Technology have produced a structural battery that performs ten times better than all previous versions"
It's 10 times more everything, just like every other battery breakthrough in the last several decades. AND, you get to use it as a building material?
- Sounds too good to be true: Check
- Sounds like every other "big" breakthrough: Check
- Article light on science and heavy on assertions: Check
- Skepticism engaged: Check
It may really be 10x, but you've mischaracterized what the paper is claiming a 10x breakthrough for. In fairness, the linked article is light on the specifics.
The claim is:
__This battery holds 10x more charge per kilo of material than previous attempts at STRUCTURAL (massless) battery materials__.
To be specific, this material holds only _ONE FIFTH_ the charge of what your smartphone's battery can hold per kilo of battery. No laws of thermodynamics are being broken, and this is not at all about battery chemistry. It's all about the 'physics' of construction materials: about how this stuff is made in the factory and layered. The basic battery chemistry going on is not much different from what's been available for years - the interesting part is how this material encases it.
You can't make a car by building the chassis out of smartphone batteries. But the promise of this paper is that you CAN build the car chassis out of this battery, and even if this battery is only 20% as effective, a car chassis is rather large, and you needed it anyway, so every drop of power you can store in the chassis itself is effectively 'free' - hence the somewhat hyperbolic 'massless' terminology.
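A back-of-the-envelope on what that "free" storage buys; every figure here is an illustrative assumption, not a number from the paper:

```python
# All figures are illustrative assumptions, not from the paper:
PHONE_WH_PER_KG = 120.0                      # typical li-ion pack, Wh/kg
STRUCTURAL_WH_PER_KG = PHONE_WH_PER_KG / 5   # "one fifth", per the comment
CHASSIS_KG = 300.0                           # assumed structural mass of a car

def chassis_storage_kwh(mass_kg=CHASSIS_KG):
    """Energy stored 'for free' if the load-bearing structure IS the battery."""
    return mass_kg * STRUCTURAL_WH_PER_KG / 1000.0

print(chassis_storage_kwh())   # 7.2 kWh of range-extending storage
```

A few kWh won't replace a traction pack, but as a supplement carried in mass you already had to haul around, the effective Wh-per-added-kg is what makes the "massless" framing attractive.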
> You can't make a car by building the chassis out of smartphone batteries
They're called Structural batteries (or [micro]structural super/ultracapacitors)
"Carmakers want to ditch battery packs, use auto bodies for energy storage" (2020) https://arstechnica.com/cars/2020/11/carmakers-want-to-ditch...
This is literally what TFA is about
OpenSSL Security Advisory
If the cloud companies just paid a team of five people 200K each to spend a year rewriting OpenSSL from scratch, they would save multiple millions in scrambling to deploy bug fixes.
Nope, they'd just create new software with different bugs that have not been discovered yet. Then we'd all be scrambling to fix those bugs.
Not if the implementation is formally verified, like miTLS [0] and EverCrypt [1]. Parts of the latter were integrated into Firefox, which even provided a performance boost (10x in one case) [2].
I think what is needed is something like EverCrypt but for TLS. Or in other words, something like miTLS but which extracts to C and/or Assembly, to avoid garbage collection and for easy interoperation with different programming languages (preferably including an OpenSSL-compatible API for backwards compatibility).
[0]: https://mitls.org/ [1]: https://hacl-star.github.io/HaclValeEverCrypt.html [2]: https://blog.mozilla.org/security/2020/07/06/performance-imp...
Formally verified against what? And what assumptions were made?
miTLS is formally verified against the TLS spec on handshaking, assuming the lower-level crypto routines are good. It is not even free from timing attacks.
EverCrypt has stronger proofs, but it is only safe as in not crashing and correct as in matching the spec on all valid input. It is not proved to be free from DoS or invalid-input attacks.
OpenSSL does more than TLS. Lots of interesting things are in the cert format parsing and management.
> Formally verified against what? and what assumption were made?
Well, tell me again, what is OpenSSL formally verified against? What assumptions were made in OpenSSL?
Formal verification does not need absolutely everything to be verified (timing attacks included) before it eliminates very large classes of bugs. Instead, it consistently produces more reliable software, many times even when only certain basic properties are proved (such as memory safety, lack of integer overflows or division by zero, etc).
Formal verification can be a continuum, like testing, but proven to work for all inputs (that possibly meet certain conditions, under certain assumptions). The assumptions and properties that are proven can always be strengthened later, as seen in multiple real-world projects (such as seL4 and others).
The result is code that is almost always a lot more bug-free than code that is not formally verified. And as I said, more properties can be proven over time, especially as new classes of attacks are discovered (e.g. timing attacks, speculation attacks in CPUs, etc).
To formally verify an implementation you need a... formal description of what is to be implemented and verified.
Constructing a formal specification from RFC8446 is possible.
Constructing a formal specification for PKIX... is not. PKIX is specified by a large number of RFCs and ITU-T/ISO specs, some of which are more formal than others. E.g., constructing a formal specification for ASN.1 should be possible (though a lot of work), while constructing a formal specification for certificate validation is really hard, especially if you must support complex PKIs like DoD's. Checking CRLs and OCSP, among other things, requires support for HTTP, and even LDAP, so now you have... a bunch more RFCs to construct formal descriptions of.
And there had better be no bugs in the formal descriptions you construct from all these specs! Recall, they're mostly not formal specs at all -- they're written in English, with smatterings of ASN.1 (which is formal, though the specs for it, though they're very very good, mostly aren't formal).
The CVE in question is in the PKIX part of OpenSSL, not the TLS implementation.
What you're asking for is not 5 man-years worth of work, but tens of man-decades. The number of people with deep knowledge of all this stuff is minute as it is -- maybe just a handful, tens at most. The number of people with deep knowledge of a lot of this stuff is larger, but still minute. So you're asking to spend decades' worth of a tiny band of people's time on this project, when there are other valuable things for them to do.
The number of people who can do dev work in this space is much larger, of course -- in the thousands. But very few of them have the right expertise to work on a formal, verified implementation of PKIX.
Plus, it's all a moving target.
Sure, we could... train a lot of people just for such a project, but it takes time to do that, and it still takes time from that tiny band of people who know this stuff really well.
I'm afraid you're asking for unobtanium.
EDIT: Plus, there's probably tens of millions of current dollars' (if not more) worth of development embodied in OpenSSL as it stands. It's probably at least that much to replace it with a verified implementation, and probably much more because the value of programmers expert enough to do it is much more than the $200K/year suggested above (even if you train new ones, it would take years of training, and then they would be just as valuable). I think a proper, formally verified replacement of OpenSSL would probably run into the hundreds of millions, especially if it's one huge project, since those tend to fail.
Well, sure, if your start with those premises, then I'm not surprised that you reach the conclusion that the goal is unachievable.
First of all, if constructing a formal specification for PKIX is not possible, then that should be telling you that it either needs to be simplified, better specified or scrapped altogether for something better (the latter would require an extremely large transition period, I'm imagining, so the first two are much preferred in this situation).
Otherwise, how can you be sure that any implementation in fact implements it correctly?
> And there had better be no bugs in the formal descriptions you construct from all these specs!
Well, I don't think that is true. You should in fact allow bugs in the formal description, otherwise how will you ever get anything done in such a project?
You see, having a formal description with a bug is much better than having no formal description at all.
If you have no formal description, you can't really tell if your code has bugs. If you have a buggy formal description, then you are able to catch some bugs in the implementation and the implementation can also catch some bugs in the formal description.
Also, some parts of the formal description can catch bugs in other parts of the formal description.
So the end result can be strictly better than the status quo.
> Plus, it's all a moving target.
Sure, but hopefully it's a moving target moving in the direction of simplification rather than growing more complex; otherwise things will just get worse rather than better, regardless of whether we keep the status quo or not. I'm not a TLS expert by any means, but I think TLS 1.3 moved in that direction for some parts of the protocol, at least (if I'm not mistaken).
Also, I think you are not fully appreciating that formal verification can be done incrementally.
You can start by doing the minimum possible, i.e. simply verifying that your code is free of runtime errors, which would eliminate all memory safety-related bugs, including Heartbleed.
This would already be better than reimplementing in Rust, because the latter can protect from memory safety bugs but not from other runtime bugs (such as division by zero, unexpected panics, etc.).
BTW, this minimal verification effort would already eliminate the second bug in this security advisory.
You can then verify other simple properties, even function by function; no complicated models are necessary at first.
For example, you could verify that your function that verifies a CA certificate, when passed the STRICT flag, is really more strict than when not passed the STRICT flag.
This would eliminate the first bug in this security advisory and all other similar bugs in the same function call-chain.
BTW, what I just said is really easy to specify, and I'm guessing it's also very easy to prove, since I'm guessing that the strict checks are just additional checks, while the normal checks are shared between the two verification modes.
Many other such properties are also easy to prove. The more difficult properties/models, or even full functional verification, can be implemented by more expert developers or even mathematicians.
I think the problem is also that OpenSSL devs, like the vast majority of devs, probably have no desire/intention/motivation/ability to do formal verification, otherwise you could even do this in the OpenSSL code base itself (although that is not ideal because verifying a program written in C requires more manual work than verifying it in a simpler language whose code extracts to C).
I'm also guessing that your budget estimate is an exaggeration, since miTLS and EverCrypt, although admittedly projects whose scope has not yet reached your ambitious goals (i.e. full functional verification of all layers in the stack), were probably done with a much smaller budget.
And it's not like you can't build on top of that and incrementally verify more properties over time, e.g. more layers of the stack or whatever.
You don't need a huge mega-project, just a starting point, and sufficient motivation.
> Well, sure, if you start with those premises, then I'm not surprised that you reach the conclusion that the goal is unachievable.
I didn't say unachievable. I said costly.
> First of all, if constructing a formal specification for PKIX is not possible, then that should be telling you that it either needs to be simplified, better specified or scrapped altogether for something better (the latter would require an extremely large transition period, I'm imagining, so the first two are much preferred in this situation).
The specs for the new thing would basically have to be written in Coq or similar. Even if you try real hard to keep it small and not make the... many mistakes made in PKIX's history... it would still be huge. And it would be even less accessible than PKIX already is.
> For example, you could verify that your function that verifies a CA certificate, when passed the STRICT flag, is really more strict than when not passed the STRICT flag.
That's just an argument for better testing. That's not implementation verification.
> This would eliminate the first bug in this security advisory and all other similar bugs in the same function call-chain.
Only if you thought to write that test to begin with. Writing a test for everything is... not really possible. SQLite3, one of the most tested codebases in the world, has a private test suite that achieves 100% branch coverage, and even that is not the same as testing every possible combination of branches.
> BTW, what I just said is really easy to specify, and I'm guessing it's also very easy to prove, since I'm guessing that the strict checks are just additional checks, while the normal checks are shared between the two verification modes.
It's not. The reason the strictness flag was added was that OpenSSL was historically less strict than the spec demanded. It turns out that when you're dealing with more than 30 years of history, you get kinks in the works. It wouldn't be different for whatever thing replaces PKIX.
> I think the problem is also that OpenSSL devs, like the vast majority of devs, probably have no desire/intention/motivation/ability to do formal verification, otherwise you could even do this in the OpenSSL code base itself [...]
You must be very popular at parties.
> I'm also guessing that your budget estimate is an exaggeration since miTLS and EverCrypt, [...]
Looking at miTLS, it only claims to be an implementation of TLS, not PKIX. Not surprising. EverCrypt is a cryptography library, not a PKIX library.
> That's just an argument for better testing. That's not implementation verification.
No, I'm not talking about testing, I'm talking about really basic formal verification:
forall (x: Certificate), verify_certificate(x, flags = 0) == invalid ==> verify_certificate(x, flags = STRICT) == invalid
This is trivial to specify and almost as trivial to prove to be correct for all inputs. Testing can't do that, no matter how good your testing.
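To make that concrete, here is a minimal sketch in Lean 4 of the shape of that theorem. The Certificate structure and the two verify functions are toy stand-ins I made up, not OpenSSL's actual API; the point is that, assuming the strict checks are purely additional checks, the "STRICT is at least as strict" property is a few lines to state and prove:

```lean
-- Toy model (hypothetical, not OpenSSL's real API): strict mode is the
-- normal checks plus extra checks, so strict acceptance implies normal
-- acceptance for *all* certificates -- no test suite needed.
structure Certificate where
  wellFormed : Bool
  notExpired : Bool
  extrasOk   : Bool  -- stand-in for the extra strict-mode checks

def verifyNormal (c : Certificate) : Bool :=
  c.wellFormed && c.notExpired

def verifyStrict (c : Certificate) : Bool :=
  verifyNormal c && c.extrasOk

-- If STRICT accepts a certificate, normal mode accepts it too
-- (equivalently: normal-mode rejection implies strict-mode rejection).
theorem strict_is_stricter (c : Certificate)
    (h : verifyStrict c = true) : verifyNormal c = true := by
  simp [verifyStrict] at h
  exact h.1
```

Against a real implementation the proof would of course be longer, but the property itself stays this small.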
> Only if you thought to write that test to begin with. Writing a test for everything is... not really possible.
I agree, but I'm not talking about testing. I'm talking about formal verification.
> It's not. The reason the strictness flag was added was that OpenSSL was historically less strict than the spec demanded. It turns out that when you're dealing with more than 30 years of history, you get kinks in the works. It wouldn't be different for whatever thing replaces PKIX.
That's totally fine and wouldn't affect the verification of that function at all. This is a very simple property to verify and it would avoid this bug and all similar bugs in that function (even if that function calls other functions, no matter how large or complex they are).
Other functions also have such easy-to-verify properties; these are not hard at all to come up with (although sure, some of the more difficult properties might be harder to verify).
> You must be very popular at parties.
I didn't want to be dismissive of OpenSSL devs or other devs in general, I just find it frustrating that there are so many myths surrounding this topic, and a lot less education, interest and investment than I think there should be, nowadays.
You are confusing verification as in "certificate" with verification as in "theorem proving", and you are still assuming a formal description of what to verify (which I'll remind you: doesn't exist). And then you go on to talk about myths and uneducated and uninterested devs. Your approach is like tilting at windmills, and will achieve exactly as much.
If I understood you correctly, then I am not confusing those two things.
Maybe you haven't noticed, but I actually wrote a theorem about a hypothetical verify_certificate() function in my previous comment. Maybe you also haven't noticed, but I didn't need a formal description of how certificate validation must be done in order to write that theorem.
And I assure you, if the implementation of verify_certificate() is anything except absolute garbage, it would be very easy to prove the theorem correct. I actually have some experience doing this, you know? I'm not just parroting something I read; I've proved code to be 100% correct multiple times using formal verification (i.e. theorem proving) tools. That is, under certain basic and reasonable assumptions, of course, e.g. that the hardware is not faulty or buggy while running the code, and that the compiler itself is not buggy, which would be a lot rarer than a bug in my code -- and even then, note that compilers and CPUs can be (and have been) formally verified.
Maybe you don't agree with my approach, but I think it's the most realistic one for a project such as this, and I am absolutely confident it would be practical, would completely eliminate most (i.e. at least 50% of) existing bugs with minimal verification effort (which almost anyone would be capable of with minimal training), and would steadily become more and more bug-free with additional, incremental verification (i.e. theorem proving) and refactoring effort.
https://project-everest.github.io/ :
> Focusing on the HTTPS ecosystem, including components such as the TLS protocol and its underlying cryptographic algorithms, Project Everest began in 2016 aiming to build and deploy formally verified implementations of several of these components in the F* proof assistant.
> […] Code from HACL*, ValeCrypt and EverCrypt is deployed in several production systems, including Mozilla Firefox, Azure Confidential Consortium Framework, the Wireguard VPN, the upcoming Zinc crypto library for the Linux kernel, the MirageOS unikernel, the ElectionGuard electronic voting SDK, and in the Tezos and Concordium blockchains.
S2n is Amazon's formally verified TLS library. https://en.wikipedia.org/wiki/S2n
IDK about a formally proven PKIX. https://www.google.com/search?q=formally+verified+pkix lists a few things.
A formally verified stack for Certificate Transparency would be a good way to secure key distribution (and revocation); where we currently depend upon a TLS library (typically OpenSSL), GPG + HKP (HTTP Key Protocol).
Fuzzing on actual hardware - with stochastic things that persist bits between points in spacetime - is a different thing.
Funny, the first hit for that search you linked is... my comment above. The "few things" other than that are alternatives to PKIX, which is fine and good, but PKIX will be with us for a long time yet. As for Everest, it jibes with what I wrote above: verified implementations of TLS are feasible (Everest also implements QUIC and similar), but -surprise!- not listed is PKIX.
I know, it sounds crazy, really crazy, but PKIX is much bigger than TLS. It's big. It's just big.
The crypto, you can verify. The session and presentation layers, you can verify. Heck, maybe you can verify your app. PKIX implementations of course can be verified in principle, but in fact it would require a serious amount of resources -- it would be really expensive. I hope someone does it, to be sure.
I suppose the first step would be to come up with a small profile of PKIX that's just enough for the WebPKI. Though don't be fooled, that's not really enough because people do use "mTLS" and they do use PKINIT, and they do use IPsec (mostly just for remote access) with user certificates, and DoD has special needs and they're not the only ones. But a small profile would be a start -- a formal specification for that is within the realm of the achievable in reasonably short order, though still, it's not small.
Both a gap and an opportunity; someone like an agency or a FAANG with a budget for something like this might do well to (a) invest in the formal methods talent pipeline and (b) very technically interface with e.g. Everest about PKIX as a core component in need of formal methods.
"The SSL landscape: a thorough analysis of the X.509 PKI using active and passive measurements" (2011) ... "Analysis of the HTTPS certificate ecosystem" (2013) https://scholar.google.com/scholar?oi=bibs&hl=en&cites=16545...
TIL about "Frankencerts": Using Frankencerts for Automated Adversarial Testing of Certificate Validation in SSL/TLS Implementations (2014) https://scholar.google.com/scholar?cites=3525044230307445257... :
> Our first ingredient is "frankencerts," synthetic certificates that are randomly mutated from parts of real certificates and thus include unusual combinations of extensions and constraints. Our second ingredient is differential testing: if one SSL/TLS implementation accepts a certificate while another rejects the same certificate, we use the discrepancy as an oracle for finding flaws in individual implementations.
> Differential testing with frankencerts uncovered 208 discrepancies between popular SSL/TLS implementations such as OpenSSL, NSS, CyaSSL, GnuTLS, PolarSSL, MatrixSSL, etc.
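The oracle from the paper is simple enough to sketch in a few lines of Python. The two validators here are deliberately toy stand-ins (one with a planted bug), not real SSL/TLS implementations:

```python
import random

# Sketch of the frankencert + differential-testing idea: randomly
# recombine fields seen in "real" certificates, feed each mutant to two
# validators, and treat any disagreement as an oracle hit.
def validator_a(cert):
    return cert["signed"] and not cert["expired"]

def validator_b(cert):  # buggy stand-in: forgets to check expiry
    return cert["signed"]

def frankencert(parts, rng):
    # pick each field independently from the pool of observed values
    return {field: rng.choice(values) for field, values in parts.items()}

parts = {"signed": [True, False], "expired": [True, False]}
rng = random.Random(42)
mutants = [frankencert(parts, rng) for _ in range(100)]
discrepancies = [c for c in mutants if validator_a(c) != validator_b(c)]

# Every disagreement is a signed-but-expired cert that only b accepts,
# pointing straight at the planted bug.
assert all(c["signed"] and c["expired"] for c in discrepancies)
print(len(discrepancies) > 0)  # True
```

With real implementations the "parts" pool comes from harvested certificates and the validators are OpenSSL, NSS, GnuTLS, etc., but the oracle itself is exactly this comparison.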
W3C ld-signatures / Linked Data Proofs, and MerkleProof2017: https://w3c-ccg.github.io/lds-merkleproof2017/
"Linked Data Cryptographic Suite Registry" https://w3c-ccg.github.io/ld-cryptosuite-registry/
ld-proofs: https://w3c-ccg.github.io/ld-proofs/
W3C DID: Decentralized Identifiers don't solve for all of PKIX (x.509)?
"W3C DID x.509" https://www.google.com/search?q=w3c+did+x509
Thanks for the link about frankencerts!
How much total throughput can your wi-fi router really provide?
Out of curiosity, does any company benchmark other types of commercial networking devices like this? Things like load balancers, DDoS protection, switches, etc. How do businesses shop for such products without a standard measure from an independent third party?
netperf and iperf are utilities for measuring network throughput: https://en.wikipedia.org/wiki/Iperf
It's possible to approximate the https://dslreports.com/speedtest results using the flent CLI or Qt GUI (which calls e.g. fping and netperf), and to isolate out ISP variance by running a netperf server on a decent router and/or a workstation with a sufficient NIC (at least 1 Gbps). https://flent.org/tests.html
`dslreports_8dn`: https://github.com/tohojo/flent/blob/master/flent/tests/dslr...
From https://flent.org/ :
> RRUL: Create the standard graphic image used by the Bufferbloat project to show the down/upload speeds plus latency in three separate charts:
> `flent rrul -p all_scaled -l 60 -H address-of-netserver -t text-to-be-included-in-plot -o filename.png`
In 2021, most routers - even with OpenWRT and hardware-offloading - cannot actually push 1 Gigabit over wired Ethernet, though the port spec does say 1000 Mbps.
The Most Important Scarce Resource Is Legitimacy
It seems that nearly every "public good" in the Ethereum ecosystem got gobbled up by VCs and pivoted to for-profit endeavors. I don't mind VCs and I don't mind people seeking profit, but this moral high ground that Ethereum has been trying to build for itself is delusional. In practice it is just a marketing gimmick.
An astute observer may have noticed that Vitalik has a habit of coming up with definitions whose standards only he and the Ethereum Foundation can meet.
The reality is everyone is just trying to make money. I think we are better off coming to terms with that instead of living in a collective fantasy.
Public goods are by definition non-rivalrous and non-excludable. How can they be gobbled?
https://en.wikipedia.org/wiki/Public_good_(economics)
(I'm not being pedantic here. The concept of a public good, in this technical sense, is a very important concept to have -- and it's specifically the kind of "public goods" the article was talking about, along with ideas for improving their tendency to be under-supplied. Goods that can be gobbled up by someone tend to be produced at closer to the optimum amount, since there's a less diffuse private incentive for their production.)
Public goods ... Welfare economics ... Social choice theory, Arrow's, Indifference curve: https://en.wikipedia.org/wiki/Indifference_curve
People do collectibles; commemorative plates.
A few notes on message passing
[deleted]
The big problem nobody talks about with actors and message passing is non-determinism and state.
If two otherwise independent processes A and B are sending messages to a single process C, it's non-deterministic which arrives at C first. This can create a race condition. C may respond differently depending on which messages arrives first - A's or B's. This is also effectively an example of shared mutable state - the state of C, which can be mutated whenever a message is received - is shared with A and B because they're communicating with it and their messages work differently based on the current state.
Non-determinism, race-conditions, shared mutable state: the absolute opposite of what we want when dealing with concurrency and parallelism.
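A tiny Python sketch of the problem (process C modeled as a fold over its inbox; the names are illustrative): the same two messages, delivered in the two possible orders, leave C in different states:

```python
# C's behavior modeled as a function of message arrival order. A's
# messages add to C's state, B's messages multiply it -- so C's final
# state depends entirely on which message wins the race.
def process_c(inbox):
    state = 0
    for sender, value in inbox:
        if sender == "A":
            state += value
        else:  # sender == "B"
            state *= value
    return state

a_first = process_c([("A", 3), ("B", 2)])  # (0 + 3) * 2 = 6
b_first = process_c([("B", 2), ("A", 3)])  # (0 * 2) + 3 = 3
print(a_first, b_first)  # 6 3
```

In a real actor system the scheduler picks the order, so you get one of these results non-deterministically on each run.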
> Luckily, global orders are rarely needed and are easy to impose yourself (outside distributed cases): just let all involved parties synchronize with a common process.
When there are multiple agents/actors in a distributed system, the timestamp resolution is datetime64, clock synchronization and network latency are variable, and non-centralized resilience is necessary to eliminate single points of failure, global ordering is impractical to impossible: there is no natural unique key with which to impose a [partial] preorder [1][2], and there are key collisions when you try to merge the streams.
Just don't cross the streams.
[1] https://en.wikipedia.org/wiki/Preorder_(disambiguation)
[2] https://en.wikipedia.org/wiki/Partially_ordered_set
The C in CAP theorem is for Consistency [3][4]. Sequential consistency is elusive because something probably has to block/lock somewhere unless you've optimally distributed the components of the control flow graph (CFG).
[3] https://en.wikipedia.org/wiki/Consistency_model
[4] https://en.wikipedia.org/wiki/CAP_theorem
FWIU, TLA+ can help find such issues. [5]
[5] https://en.wikipedia.org/wiki/TLA%2B
Wouldn't a partial order be possible using logical clocks? Vector clocks probably don't satisfy "practical", but a hybrid logical clock is practical if you can assume clock synchronisation is within some modest delta.
> if you can assume clock synchronisation is within some modest delta.
From experience with clocks, you should really be actively testing the delta, and not just assuming it.
The assumption would be the limit you actively test for.
The Lamport timestamp: https://en.wikipedia.org/wiki/Lamport_timestamp :
> The Lamport timestamp algorithm is a simple logical clock algorithm used to determine the order of events in a distributed computer system. As different nodes or processes will typically not be perfectly synchronized, this algorithm is used to provide a partial ordering of events with minimal overhead, and conceptually provide a starting point for the more advanced vector clock method.
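A minimal Lamport-clock sketch in Python (illustrative names, not any particular library): the merge rule `max(local, remote) + 1` on receive is what guarantees that a receive is always timestamped after the corresponding send, even when the receiver's local counter is "ahead":

```python
# Lamport logical clock: a per-process counter that ticks on local
# events and jumps past the sender's timestamp on receive, yielding a
# partial order consistent with causality.
class LamportClock:
    def __init__(self):
        self.time = 0

    def tick(self):                 # local event
        self.time += 1
        return self.time

    def send(self):                 # timestamp attached to a message
        return self.tick()

    def receive(self, msg_time):    # merge rule: max(local, remote) + 1
        self.time = max(self.time, msg_time) + 1
        return self.time

a, b = LamportClock(), LamportClock()
t_send = a.send()                   # a's clock: 1
b.tick(); b.tick()                  # b is "ahead" locally: 2
t_recv = b.receive(t_send)          # b jumps to max(2, 1) + 1 = 3
print(t_send, t_recv)               # 1 3
```

Concurrent (causally unrelated) events can still tie or interleave arbitrarily, which is why vector clocks exist.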
Duolingo's language notes all on one page
Succinct. What a useful reference.
An IPA (International Phonetic Alphabet) reference would be helpful, too. After taking linguistics in college, I found these Sozo videos of US English IPA consonants and vowels that simultaneously show {the IPA symbol, example words, someone visually and audibly producing the phoneme from two angles, and the spectrogram of the waveform}; a few, or a configurable number of, [spaced] repetitions would be helpful: https://youtu.be/Sw36F_UcIn8
IDK how cartoonish or 3d of an "articulatory phonetic" model would reach the widest audience. https://en.wikipedia.org/wiki/Articulatory_phonetics
IPA chart: https://en.wikipedia.org/wiki/International_Phonetic_Alphabe...
IPA chart with audio: https://en.wikipedia.org/wiki/IPA_vowel_chart_with_audio
All of the IPA consonant chart played as a video: "International Phonetic Alphabet Consonant sounds (Pulmonic)- From Wikipedia.org" https://youtu.be/yFAITaBr6Tw
I'll have to find the link of the site where they playback youtube videos with multiple languages' subtitles highlighted side-by-side along with the video.
Found it: https://www.captionpop.com/
It looks like there are a few browser extensions for displaying multiple subtitles as well; e.g. "YouTube Dual Subtitles", "Two Captions for YouTube and Netflix"
Ask HN: The easiest programming language for teaching programming to young kids?
Hi,
I want to start a small community pilot project to help young kids, 8 and above, get interested in programming. We will use video games and robotics projects. We want to keep our tech stack as simple as possible. Here are some of the choices:
Godot + Arduino: We can use C in Godot and Arduino. Arduino might be more interesting for kids, as opposed to neatly packaged Lego kits.
Apple SpriteKit + Lego Mindstorms: We can use Swift with Legos. But the cost will be higher.
Some of the projects we are thinking of are:
Game-ish:
1. A sound visualizer, like the old-school Winamp visualizations. Use speakers. And various other ideas around these concepts.
2. An AR project that shows the world around you in a cartoonish style. Swap faces, etc.
3. Of course, platform games.
Robotics projects:
I see a lot of tutorials for Arduino, such as robots that follow sound or light, or things like light displays. We will mostly use those.
Some harder project ideas I have are for drones, boats, and other navigational vehicles. This is why I want to use Arduino. But is C going to be too hard for young kids to play with?
What do you recommend? If this works, I would like to expand it and start a company around it.
awesome-python-in-education > "Python suitability for education" lists a few justifications for Python: https://github.com/quobit/awesome-python-in-education#python...
There is a Scratch Jr for Android and iOS. You can view Scratch code as JS. JS does run in a browser, until it needs WASI.
awesome-robotics-libraries: https://github.com/jslee02/awesome-robotics-libraries
FWIU, ROS (Robot Operating System) is now installable with Conda/Mamba. There's a jupyter-ros and a jupyterlab-ros extension: https://github.com/RoboStack/jupyter-ros
I just found this: https://coderdojotc.readthedocs.io/projects/python-minecraft...
> This documentation supports the CoderDojo Twin Cities’ Build worlds in Minecraft with Python code group. This group intends to teach you how to use Python, a general purpose programming language, to mod the popular game called Minecraft. It is targeted at students aged 10 to 17 who have some programming experience in another language. For example, in Scratch.
K12CS Framework has your high-level CS curriculum: https://k12cs.org/ [PDF]: https://k12cs.org/wp-content/uploads/2016/09/K%E2%80%9312-Co...
Educational technology > See also links to e.g. "Evidence-based education" and "Instructional theory": https://en.wikipedia.org/wiki/Educational_technology
Thank you for these resources, reviewing them now.
Yw. Np. So I just searched for "site: readthedocs.io kids python" https://www.google.com/search?q=site%3Areadthedocs.io+kids+p... and found a few new and old things:
SensorCraft (pyglet (Python + OpenGL)) from US AFRL Sensors Directorate has e.g. Gravity, Rocket Launch, and AI tutorials:
> Most people are familiar with Minecraft [...] for this project we are using a Minecraft type environment created in the Python programming language. The Air Force Research Laboratory (AFRL) Sensors Directorate located in Dayton, Ohio created this guide to inspire kids of all ages to learn to program and at the same time get an idea of what it is like to be a Scientist or Engineer for the Air Force. We created this YouTube video about SensorCraft
https://sensorcraft.readthedocs.io/en/latest/intro.html
`conda install -c conda-forge -y pyglet` should probably work. Miniforge on Win/Mac/Lin is an easy way to get Python installed on anything including ARM64 for a RPi or similar; `conda create -n scraft; conda install -c conda-forge -y python=3.8 jupyterlab jupytext jupyter-book pyglet` . If you're in a conda env, `pip install` should install things within that conda env. Here's the meta.yaml in the conda-forge pyglet-feedstock: https://github.com/conda-forge/pyglet-feedstock/blob/master/...
"BBC micro:bit MicroPython documentation" https://microbit-micropython.readthedocs.io/en/latest/
$25 for a single-board computer with a battery pack and a case (and curricula) is very reasonable: https://en.wikipedia.org/wiki/Micro_Bit
> The [micro:bit] is described as half the size of a credit card[10] and has an ARM Cortex-M0 processor, accelerometer and magnetometer sensors, Bluetooth and USB connectivity, a display consisting of 25 LEDs, two programmable buttons, and can be powered by either USB or an external battery pack.[2] The device inputs and outputs are through five ring connectors that form part of a larger 25-pin edge connector. (V2 adds a Mic and a Speaker)
Raspberry Pi for Kill Mosquitoes by Laser
This work (I think) dealt with mosquitos 30 cm away, using a servo-scanned Pi Camera (1080p) and a 1 W laser.
To work in the real world - covering a whole room or terrace - a much higher-resolution camera (or a much faster scanning system) would presumably be required. Even a 1 W laser is dangerous to eyesight if it were being fired at targets mingling with people.
The system could be mounted on small drones that would patrol larger areas - but the idea of robotic drones armed with lasers roaming around is beginning to sound worse than the mosquitos.
Good luck detecting a mosquito optically from a distance of several meters using a cheap camera and a Raspberry Pi. Oh, and you want to do it from a moving drone. That will certainly make it work!
Just look at the images in the article - the guy's best result was detecting a black speck appearing on a nearby white wall with some 60-70% reliability (based on his own numbers). So you would be missing a lot of mosquitoes - but would be happily firing the laser at random shadows and whatnot. And that was in a completely stationary setup under controlled lab conditions, i.e. not at all something resembling a typical poorly lit room!
This article is BS. Preprints are not peer reviewed (i.e. nobody has checked anything in it - it could even be a complete hoax), and it is a pretty typical gadgetry-style paper (we do it because we can, not because it makes sense) that you write when you need to fill up your resume with research papers (e.g. for keeping/obtaining a job).
The "save the world" (mosquito control, diseases, etc.) justification is also par for the course for this type of crappy paper. Anyone who seriously thinks that one could control mosquito problem by shooting them one by one by a laser is delusional.
But neural networks and "AI" are being used, so it has to be cutting edge groundbreaking stuff, right?
BTW, this nonsense idea was floated as a publicity stunt a few years ago (including a slow-motion video of a laser burning the wing off a mosquito in flight), and it seems that some Russian PhD student from a fairly obscure uni either didn't do their research or has reinvented the wheel (or just plain copied the thing without attribution). The list of irrelevant or only very tangentially relevant (it's about mosquitoes, so in scope, right?) references is a dead giveaway there (a paper on mosquitoes spreading Zika? seriously?).
Here, it was even on National Geographic in 2010(!): https://www.youtube.com/watch?v=BKm8FolQ7jw
Oh, and that was supposed to be a handheld device to boot. With the same "save the world from malaria" spiel too. I wonder what the owners of the company that was pushing this concept to investors back then are trying to sell today ...
There are actually multiple videos on YouTube showing products from different companies that were attempting to push this as some sort of viable concept.
> using a cheap camera
even if you had a RED Komodo feeding uncompressed 4K DCI 60fps video to a pci-express bus capture card, the sensor resolution and tiny size of mosquitoes means that unless the lighting conditions are just right, and the mosquito is somehow highlighted against a background, it's going to be very hard to pick them out at distances of 2 or more meters.
and that's before you get into the software problem of processing the fire hose of data that is 4096x2160 at 60fps raw. and the hardware cost of a very serious workstation class PC capable of taking the capture at 1:1 realtime.
possibly a lidar based sensor or something might be more suitable to locating the x/y/z position of mosquitoes in a few meter area.
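Back-of-the-envelope numbers support the pessimism. All figures below are assumed for illustration (4096-pixel-wide sensor, 70-degree horizontal field of view, ~5 mm mosquito at 2 m), not taken from the article:

```python
import math

# How many pixels does a ~5 mm mosquito span at 2 m, given a
# 4096-pixel-wide sensor behind a 70-degree horizontal field of view?
sensor_px = 4096
fov_deg = 70.0
distance_m = 2.0
target_mm = 5.0

# Width of the scene covered by the sensor at that distance:
field_width_mm = 2 * distance_m * math.tan(math.radians(fov_deg / 2)) * 1000
px_across = sensor_px / field_width_mm * target_mm
print(round(px_across, 1))  # 7.3
```

Roughly 7 pixels across, before motion blur, low light, and background clutter eat into it.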
Mosquitos, as we all know, have a highly distinctive auditory signature.
A phased microphone array is the only sensible approach to localized mosquito detection. It would probably work reasonably well.
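A hedged sketch of the far-field geometry behind that (two mics; all numbers illustrative, not from any real product): the arrival-time difference between the mics pins down the source bearing via sin(theta) = c * dt / d:

```python
import math

# Two-microphone time-difference-of-arrival (TDOA) bearing estimate,
# using the far-field approximation sin(theta) = c * dt / d.
SPEED_OF_SOUND = 343.0  # m/s at room temperature

def bearing_from_tdoa(delta_t, mic_spacing):
    """Source angle off the array's broadside, in radians."""
    # Clamp to [-1, 1] so measurement noise can't crash asin().
    s = max(-1.0, min(1.0, SPEED_OF_SOUND * delta_t / mic_spacing))
    return math.asin(s)

# Simulate a whine arriving from 30 degrees off broadside, mics 0.5 m apart:
mic_spacing = 0.5
delta_t = mic_spacing * math.sin(math.radians(30)) / SPEED_OF_SOUND
estimate_deg = math.degrees(bearing_from_tdoa(delta_t, mic_spacing))
print(round(estimate_deg, 1))  # 30.0
```

A real phased array uses many mics and cross-correlation to get the delays, but each mic pair contributes exactly this constraint.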
The problem is that the entire field is patent encumbered, because Myhrvold's company Intellectual Ventures has done some research on it, and no one in their right mind would go up against those guys.
Yeah, they already did sharks with lasers. IDK what the licensing terms are on that
ONE MILLION DOLLARS
Donate Unrestricted
The article mentioned the Gates Foundation, which has a terrible record in education initiatives:
- The big failure in small schools initiative: https://marginalrevolution.com/marginalrevolution/2010/09/th...
- The big failure in teacher evaluation initiative: https://www.businessinsider.com/bill-melinda-gates-foundatio...
- The big failure in Common Core initiative: https://www.washingtonpost.com/news/answer-sheet/wp/2016/06/...
Now it is funding anti-racist math: https://equitablemath.org/
Unbelievable.
Rather than diminishing the efforts of others, you could start helping by describing your own efforts to improve education (in order to qualify your ability to assess the mentioned and other efforts to improve education and learning).
In contrast to seed and series funding in exchange for a seat on the board of a for-profit venture, an NGO non-profit organization can choose whether to accept restricted donations, and government organizations have elected public-servant leaders who lead and find funding.
Works based on Faust: https://en.wikipedia.org/wiki/Works_based_on_Faust
Bitcoin Is Time
I'm ready to be downvoted to oblivion, especially by all the Bitcoin purists, but I hope my message sparks at least some curiosity.
Bitcoin is getting "weaker" every year for several reasons:
- emerging centralization due to economies of scale. If this trend continues the mining power will be so consolidated that a 51% attack will be likely. The Nakamoto coefficient is already at 4, and tending towards 3.
- energy usage due to Proof of Work is growing astronomically, and the higher the price of bitcoin, the less incentive there is to use renewable energy. Bitcoin uses more energy than the country of Argentina.
- transactions are SLOW, and expensive. The lightning network is incredibly buggy and won't actually solve the problems it's promising.
There is a better alternative: RaiBlocks, renamed Nano in 2018. It got a bad rap due to the BitGrail hack, but it's picking up steam again. The developer community is great, the main dev team on the Nano protocol has been consistently chugging along on a shoestring budget through three years of crypto winter, and the Nano community is made up of users, not speculators.
This article sums it up quite nicely: https://senatusspqr.medium.com/why-nano-is-the-ultimate-stor...
I'm happy to receive downvotes, but all I ask in return is that you give it a read with an open mind.
> The lightning network is incredibly buggy and won't actually solve the problems it's promising.
Not an expert, but I found this a readable summary of different layer-2 scaling strategies and of why Ethereum developers prefer Rollups instead of Channels (like Lightning):
"Bitcoin scalability problem" could link to the Ethereum design docs: https://en.wikipedia.org/wiki/Bitcoin_scalability_problem
The Ethereum design docs could link to direct-listed premined [stable] coins as a solution for Proof of Work and TPS reports: https://github.com/flare-eng/coston#smart-contracts-with-xrp
(edit) re: n-layer solutions: The https://interledger.org/ RFCs and something like Transaction Permission Layer (TPL) will probably be helpful for interchain compliance.
> Interledger is not tied to a single company, blockchain, or currency.
From https://tplprotocol.org/ :
> The challenge: Current blockchain-based protocols lack an effective governance mechanism that ensures token transfers comply with requirements set by the project that issued the token.
> Projects need to set requirements for a variety of reasons. For instance, remaining compliant with securities laws, limiting transfer to beta testers, or limiting transfer to a particular geo-spatial location. Whatever your reason, if a requirement can be verified by a third-party, TPL will be able to help.
In the US, S-Corps can't have international or more than n shareholders, for example; so if firms even wanted to issue securities on a first-layer network, they'd need an extra-chain compliance mechanism to ensure that their issuance is legal pursuant to local, sovereign, necessary policies. Re-issuing stock certificates is something that has to be done sometimes. When is it possible to cancel outstanding tokens?
Foundational Distributed Systems Papers
From "Ask HN: Learning about distributed systems?" https://news.ycombinator.com/item?id=23932271 :
> Papers-we-love > Distributed Systems: https://github.com/papers-we-love/papers-we-love/tree/master...
> awesome-distributed-systems also has many links to theory: https://github.com/theanalyst/awesome-distributed-systems
And links to more lists of distributed systems papers under "Meta Lists": https://github.com/theanalyst/awesome-distributed-systems#me...
In reviewing this awesome list, today I learned about this playlist: "MIT 6.824 Distributed Systems (Spring 2020)" https://youtube.com/playlist?list=PLrw6a1wE39_tb2fErI4-WkMbs...
> awesome-bigdata lists a number of tools: https://github.com/onurakpolat/awesome-bigdata
Low-Cost Multi-touch Whiteboard using the Wiimote (2007) [video]
"Interactive whiteboard" / "smart board" https://en.wikipedia.org/wiki/Interactive_whiteboard
Wii Remote > Features > Sensing: https://en.wikipedia.org/wiki/Wii_Remote#Sensing
.. > Third-Party Development describes a number of applications for IR/optical tracking with an array of nonstationary emitters: https://en.wikipedia.org/wiki/Wii_Remote#Third-party_develop...
Augmented Reality (AR) > Technology > Tracking: https://en.wikipedia.org/wiki/Augmented_reality#Tracking
... links to "VR positional tracking" which does have headings for "Optical" and "Sensor fusion": https://en.wikipedia.org/wiki/VR_positional_tracking
How to Efficiently Choose the Right Database for Your Applications
I kinda disagree with the separate branch for "document database" for Mongo. Mongo is a key-value store with a thin wrapper that converts BSON<->JSON, plus indices on subfields.
You can achieve exactly the same thing with PostgreSQL tables with two columns (key JSONB PRIMARY KEY, value JSONB), including indices on subfields. With way more other functionality and support options.
> You can achieve exactly the same thing with PostgreSQL tables with two columns (key JSONB PRIMARY KEY, value JSONB), including indices on subfields. With way more other functionality and support options.
PostgreSQL docs > "JSON Functions and Operators" https://www.postgresql.org/docs/current/functions-json.html
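A minimal sketch of that two-column document-table pattern, here using Python's sqlite3 and SQLite's JSON functions as a stand-in for Postgres JSONB (in Postgres you'd declare the columns as `jsonb` and typically use a GIN index; the table and field names are made up):

```python
import json
import sqlite3

# "Document store on a relational DB": a two-column (key, JSON value)
# table plus an expression index on a subfield. SQLite's json_extract
# stands in for Postgres's jsonb operators; requires an SQLite build
# with JSON support (the default in modern builds).
db = sqlite3.connect(":memory:")
db.execute("CREATE TABLE docs (key TEXT PRIMARY KEY, value TEXT)")
db.execute("CREATE INDEX idx_name ON docs (json_extract(value, '$.name'))")

db.execute("INSERT INTO docs VALUES (?, ?)",
           ("user:1", json.dumps({"name": "ada", "role": "admin"})))
row = db.execute(
    "SELECT key FROM docs WHERE json_extract(value, '$.name') = ?",
    ("ada",)).fetchone()
print(row[0])  # user:1
```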
MongoDB can do jsonSchema:
> Document Validator¶ You can use $jsonSchema in a document validator to enforce the specified schema on insert and update operations:
db.createCollection( <collection>, { validator: { $jsonSchema: <schema> } } )
db.runCommand( { collMod: <collection>, validator:{ $jsonSchema: <schema> } } )
https://docs.mongodb.com/manual/reference/operator/query/jso...
Looks like there are at least 2 ways to handle JSONSchema with Postgres: https://stackoverflow.com/questions/22228525/json-schema-val... ; neither of which is written in e.g. Rust or Go.
Is there a good way to handle JSON-LD (JSON Linked Data) with Postgres yet?
There are probably 10 comparisons of triple stores with rule inference slash reasoning on data ingress and/or egress.
A Data Pipeline Is a Materialized View
Sometime when I am old(er) and (somehow?) have more time, I'd like to jot down a "Rosetta Stone" of which buzzwords map to the same concepts. So often we change our vocabulary every decade without changing what we're really talking about.
Things started out in a scholarly vein, but the rush of commerce hasn't allowed much time to think where we're going. — James Thornton, Considerations in computer design (1963)
Like a Linked Data thesaurus with typed, reified edges between nodes/concepts/class_instances?
Here's the WordNet RDF Linked Data for "jargon"; like the "Jargon File": http://wordnet-rdf.princeton.edu/lemma/jargon
A Semantic MediaWiki Thesaurus? https://en.wikipedia.org/wiki/Semantic_MediaWiki :
> Semantic MediaWiki (SMW) is an extension to MediaWiki that allows for annotating semantic data within wiki pages, thus turning a wiki that incorporates the extension into a semantic wiki. Data that has been encoded can be used in semantic searches, used for aggregation of pages, displayed in formats like maps, calendars and graphs, and exported to the outside world via formats like RDF and CSV.
Google Books NGram viewer has "word phrase" term occurrence data by year, from books: https://books.google.com/ngrams
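That "Rosetta Stone" could start as something as simple as a reverse-lookup table from era-specific jargon to a shared concept; the term groupings below are illustrative, not authoritative:

```python
# Toy "Rosetta Stone": map decade-specific buzzwords onto shared
# concepts. The groupings here are made up for illustration.
concepts = {
    "precomputed-derived-relation": {"materialized view", "data pipeline",
                                     "ETL job", "derived table"},
    "content-addressed-store": {"blob store", "object storage", "CAS"},
}

def concept_for(term):
    """Reverse lookup: which shared concept does this buzzword name?"""
    for concept, terms in concepts.items():
        if term in terms:
            return concept
    return None

print(concept_for("data pipeline"))  # -> precomputed-derived-relation
```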
There’s no such thing as “a startup within a big company”
When I was at PowerBI in Microsoft, all the execs hailed it as a startup within Microsoft: come work here instead of Uber. I worked like a dog, sometimes till 2 am. My manager would routinely ask us to come in on weekends. I was naive; I thought we were growing the customer base, that this is what a startup looks like.
The ultimate realization was in a startup you have equity, a decent amount in a good startup. At Microsoft it was a base salary and set amount of stock. What we did moved very little of the top revenue metric. It made little difference if I worked like a dog or slacked. The promos were very much a “buddy buddy” system.
In the end I realized you can’t have startups in big companies (especially as an engineer: you don’t have the huge upside if the startup is successful; your upside is capped).
Startups work because you have skin in the game, when you build something people want, you get to reap rewards proportional to it. That correlation and feedback loop is very important.
At big companies you don’t have the same correlation. Some big-shot exec they hired reaps far, far more from the work you did.
Equity is what builds wealth.
>The ultimate realization was in a startup you have equity, a decent amount in a good startup. At Microsoft it was a base salary and set amount of stock.
I joined a startup in 1999. There were 3 founders and I was employee #2 after that. I received a ton of options (this was before RSUs became popular). We had a great product and a great team, but 18 months later ran out of money and unfortunately it was right after the dotcom implosion of early 2001 when funding had completely dried up.
My takeaway was that base salary is actually the most important component of TC. Cash bonus based on some metric that you control comes second. Equity comes third. If you work for a FAANG, maybe equity can move higher up (though it remains to be seen how long this will be true).
Outside of FAANG (and top executives at F500 sized public companies) very few people are getting rich off of the "equity" component of their TC. The vast majority of startups go bust before IPO or acquisition.
Time is the most valuable commodity you have, don't squander it for lottery tickets and empty promises.
>Time is the most valuable commodity you have, don't squander it for lottery tickets and empty promises.
That goes the other way around too, the only way to have massive amounts of free time without going FIRE is to win the lottery. So why not throw the dice once or twice when you're young and then settle into a stable corporate job if it doesn't pay out?
Modern Silicon Valley lets you do both. Why not throw the dice and still make around $200k cash?
As a [good] engineer you can have your cake and eat it too. Sure it won’t make you a billionaire but you can live a comfy life, maybe win the lottery, and retire at 50 if the lottery fails.
Living and working elsewhere with the wages of the region reduces expenses and opportunities; but the wealth of educational resources online [1][2] does make it feasible to even bootstrap a company on the side. Do you need to borrow money to scale quickly enough to pay expenses with sufficient cash flow for the foreseeable future?
Income sources: Passive income, Content, Equity that's potentially worth nothing, a backtested diversified portfolio (Golden Butterfly or All Weather Portfolio and why?) of sustainable investments, Business models [3]; Software implementations of solutions to businesses, organizations, and/or consumers' opportunities
Single-payer / Universal Healthcare is a looming family expense for many entrepreneurs; many of whom do get into entrepreneurship later in life.
Small businesses make up a significant portion of GDP. Small businesses have to accept risk.
There's still opportunity in the world.
[1] Startup School > Curriculum https://www.startupschool.org/curriculum
[2] https://www.ycombinator.com/library
[3] "Business models based on the compiled list at [HN]" https://gist.github.com/ndarville/4295324
From "Why companies lose their best innovators (2019)" https://news.ycombinator.com/item?id=23887903 :
> "Intrapreneurial." What does that even mean? The employee, within their specialized department, spends resources (time, money, equipment) on something that their superior managers have not allocated funding for because they want: (a) recognition; (b) job security; (c) to save resources such as time and money; (d) to work on something else instead of this wasteful process; (e) more money.
If you are going to start something on the side, carefully read your current employment agreement, particularly the part about IP assignment. Most Silicon Valley companies I’ve worked with, including FAANGs, claim ownership of everything you produce, inside or outside of employment, at home or in the office, using their equipment or yours. You don’t want to luck into a unicorn idea and have your former employer’s lawyers send you that letter...
Unusual data point (I'm not sure if South African practice is applicable elsewhere), but most bigger corps I've worked at have provided a mechanism to declare external interests, pending management approval, to work around these global clauses.
On paper any IP would by default belong to the corp, but after some discussion/negotiation with your line manager and their higher-ups, you could do the paperwork to declare it.
Are these actually enforceable?
It probably doesn’t matter in practice. They will have more lawyers and more money to burn on legal process than you. Who will go bankrupt first fighting a legal battle between you and, say, Apple?
In practice, I don't think I've ever seen a SV example of that kind of litigation from garage development of IP. Maybe I've just missed them?
Theft of IP from one company to a new company on the other hand there are multiple public examples of.
The only time they'd actually sue is if you made something worth money. In that case, there's a good chance you can get investors to pay for the lawsuit.
It does matter in practice in California.
If it's "related" to your employer's business, yes, unless you agree to a variation of the contract.
Ask HN: Keyrings: per-package/repo; commit, merge, and release keyrings?
Are there existing specs for specifying per-package release keyrings and per-repo commit and merge keyrings?
Keyring: a collection of keys imported into a datastore with review.
DevOpsSec; Software Supply Chain Security
Packages {X, Y, Z} in Indexes {A, B, C} are artifacts output by Builds (on workstations or servers with security policies). A Build runs a build script (often deliberately not specified in a complete programming language, instead preferring YAML, in order to minimize build complexity), which should be drawn from a stable commit hash in a Repository (which may be a copy of technically zero or more branches of a Repository hosted centrally, next to Issues, Build logs, and Build-artifact Signing Keys).
Maximally, are there potentially more keyrings (or key-authorization mappings between key and permission) than (1) commit; (2) merge; and (3) release?
Source Projects: Commit, Merge, [Run Build, Login to post-build env], Release (and Sign) package
Downstream Distros: Commit, Merge, [Run Build, Login to post-build env], Release (and Sign) package for the {testing, stable, security} (Signed) Index catalogs
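One way to model the question: treat each keyring as a mapping from a permission (commit, merge, release) to the set of authorized key fingerprints; a toy sketch with made-up fingerprints:

```python
# Per-repo keyrings modeled as permission -> authorized key
# fingerprints. The fingerprints are made up for illustration;
# a real system would verify actual signatures, not set membership.
keyrings = {
    "commit":  {"FP-ALICE", "FP-BOB"},
    "merge":   {"FP-ALICE"},
    "release": {"FP-RELEASE-BOT"},
}

def is_authorized(fingerprint, action):
    """Is this key fingerprint authorized for the given action?"""
    return fingerprint in keyrings.get(action, set())

print(is_authorized("FP-BOB", "commit"))    # True
print(is_authorized("FP-BOB", "release"))   # False
```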
Threat Actors Now Target Docker via Container Escape Features
It may be quipped that Docker or Kubernetes is remote code execution (RCE) as a service!
The vulnerability pointed out here is that server operators have exposed Docker's API to the public internet, allowing anyone to run a container. The use of a privileged container is just icing on the cake, and probably not necessary for a cryptominer.
A more subtle and interesting attack vector is genuine container images that become compromised, much like any other package manager, particularly ones that search for the Docker API on the host's internal IP address, hoping only public network access is firewalled.
Docker engine docs > "Protect the Docker daemon socket" https://docs.docker.com/engine/security/protect-access/
dev-sec/cis-docker-benchmark /controls: https://github.com/dev-sec/cis-docker-benchmark/tree/master/...
Eh. This advice is less practical than it’s made to seem. Like it “works” but it’s not really usable for anything other than connecting two privileged apps over a hostile network.
* Docker doesn’t support CRLs so any compromised cert means reissuing everyone’s cert.
* Docker’s permissions are all or nothing without a plug-in. And if you’re going that route the plug-in probably has better authentication.
* Docker’s check is just “is the cert signed by the CA”, so you have to do one CA per machine / group of homogeneous machines.
* You either get access to the socket or not with no concept of users so you get zero auditing.
* Using SSH as transport helps, but then you have to also lock down SSH, which isn’t impossible but is more work and surface area to cover than feels necessary. Also, since your access is still via the Unix socket, it’s all-or-nothing permissions again.
django-ca is one way to manage a PKI including ACMEv2, OCSP, and a CRL (Certificate Revocation) list: https://github.com/mathiasertl/django-ca
"How can I verify client certificates against a CRL in Golang?" mentions a bit about crypto/tls and one position on CRLs: https://stackoverflow.com/questions/37058322/how-can-i-verif...
CT (Certificate Transparency) is another approach to validating certs wherein x.509 cert logs are written to a consistent, available blockchain (or in e.g. google/trillian, a centralized db where one party has root and backup responsibilities also with Merkle hashes for verifying data integrity). https://certificate.transparency.dev/ https://github.com/google/trillian
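The Merkle-hash integrity idea behind CT logs can be sketched in a few lines; note that real CT logs (RFC 6962) add domain-separation prefixes and handle odd levels differently, so this is only a conceptual sketch:

```python
import hashlib

def merkle_root(leaves):
    """Minimal Merkle root over byte-string leaves. Real CT logs
    (RFC 6962) add 0x00/0x01 domain-separation prefixes and define
    odd-level handling differently; this sketch just duplicates the
    last node to pair it up."""
    level = [hashlib.sha256(leaf).digest() for leaf in leaves]
    while len(level) > 1:
        if len(level) % 2:
            level.append(level[-1])
        level = [hashlib.sha256(level[i] + level[i + 1]).digest()
                 for i in range(0, len(level), 2)]
    return level[0].hex()

r1 = merkle_root([b"cert-a", b"cert-b"])
r2 = merkle_root([b"cert-a", b"cert-x"])
print(r1 != r2)  # True: changing any leaf changes the root
```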
Does docker ever make the docker socket available over the network, over an un-firewalled port by default? Docker Swarm is one config where the docker socket is configured to be available over TLS.
Docker Swarm docs > "Manage swarm security with public key infrastructure (PKI)" https://docs.docker.com/engine/swarm/how-swarm-mode-works/pk... :
> Run `docker swarm ca --rotate` to generate a new CA certificate and key. If you prefer, you can pass the --ca-cert and --external-ca flags to specify the root certificate and to use a root CA external to the swarm. Alternately, you can pass the --ca-cert and --ca-key flags to specify the exact certificate and key you would like the swarm to use.
Docker ("moby") and podman v3 socket security could be improved:
> From "ENH,SEC: Create additional sockets with limited permissions" https://github.com/moby/moby/issues/38879 ::
> > An example use case: securing the Traefik docker driver:
> > - "Docker integration: Exposing Docker socket to Traefik container is a serious security risk" https://github.com/traefik/traefik/issues/4174#issuecomment-...
> > > It seems it only require (read) operations : ServerVersion, ContainerList, ContainerInspect, ServiceList, NetworkList, TaskList & Events.
> > - https://github.com/liquidat/ansible-role-traefik
> > > This role does exactly that: it launches two containers, a traefik one and another to securely provide limited access to the docker socket. It also provides the necessary configuration.
> > - ["What could docker do to make it easier to do this correctly?"] https://github.com/Tecnativa/docker-socket-proxy/issues/13
> > - [docker-socket-proxy] Creates a HAproxy container that proxies limited access to the [docker] socket
This looks like the attacker is just using a publicly exposed Docker API honeypot to run a new Docker container with `privileged: true`. I don't see why that's particularly interesting given that they could just bind-mount the host / filesystem with `pid: host` and do pretty much whatever they want?
It would be much more interesting to see a remote-code-execution attack against some vulnerable service deploying a container escape payload to escalate privileges from the Docker container to the host.
Exposed on the WAN is obviously bad, but how do you keep your own containers from calling those APIs? Yes, you don't mount the docker socket in the container, but what about the orchestrator APIs?
With kubernetes, you enable RBAC, and only allow each pod the absolute minimum of access required.
In my own setup, I have only five pods with any privileges: kubernetes-dashboard, nginx-ingress, prometheus, cert-manager, and the gitlab-runner, and of those only kubernetes-dashboard can modify existing pods in all spaces, cert-manager can only write secrets of its own custom type, and gitlab-runner can spawn non-privileged pods in its own namespace (but not mount anything from the host, nor any resources from any other namespace).
And when building docker images in the CI, I use google’s kaniko to build docker images from within docker without any privileges (it unpacks docker images for building and runs them inside the existing container, basically just chroot).
All these APIs are marked very clearly and obviously as DO NOT MAKE AVAILABLE PUBLICLY. If you still make them public, well it’s pretty much your own fault.
Does RBAC not limit these by default? Does cert-manger not already give itself restricted permission on install? Do I need to fix up my cluster right now? If so, do you have any example RBAC yamls? :D
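For reference, a namespace-scoped Role/RoleBinding along the lines described above (a CI service account that can spawn pods only in its own namespace) might look roughly like this; all names here are hypothetical:

```yaml
# Hypothetical namespace-scoped Role: lets a CI service account manage
# pods only within the "ci" namespace (all names are illustrative).
apiVersion: rbac.authorization.k8s.io/v1
kind: Role
metadata:
  namespace: ci
  name: runner-pod-manager
rules:
  - apiGroups: [""]
    resources: ["pods"]
    verbs: ["get", "list", "create", "delete"]
---
apiVersion: rbac.authorization.k8s.io/v1
kind: RoleBinding
metadata:
  namespace: ci
  name: runner-pod-manager-binding
subjects:
  - kind: ServiceAccount
    name: gitlab-runner
    namespace: ci
roleRef:
  kind: Role
  name: runner-pod-manager
  apiGroup: rbac.authorization.k8s.io
```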
> And when building docker images in the CI, I use google’s kaniko to build docker images from within docker without any privileges (it unpacks docker images for building and runs them inside the existing container, basically just chroot).
You can also use standalone buildkit which comes with the added benefit of being able to use the same builder locally natively.
No RBAC doesn't automatically do this. And many publicly available Helm charts are missing these basic security configurations. You should use Gatekeeper or similar to enforce these settings throughout your cluster.
Gatekeeper docs: https://open-policy-agent.github.io/gatekeeper/website/docs/
Gatekeeper src: https://github.com/open-policy-agent/gatekeeper
awesome-container-security: https://github.com/kai5263499/awesome-container-security
"container-security" GitHub label: https://github.com/topics/container-security
In somewhat related news, rootless Docker support is no longer an experimental feature. Has anyone used it and liked/disliked it?
I have, and it's great/you forget about it pretty quickly. I've actually gone (rootful?)docker->(rootless)podman->(rootless)docker, and I've written about it:
- https://vadosware.io/post/rootless-containers-in-2020-on-arc...
- https://vadosware.io/post/bits-and-bobs-with-podman/
- https://vadosware.io/post/back-to-docker-after-issues-with-p...
I still somewhat long for podman's lack of a daemon, but some of the other issues with it currently leave it completely out for me (now that docker has caught up) -- for example lack of linking support.
Podman works great outside of those issues though, you just have to be careful if a tool does something in a way that expects docker semantics (usually the daemon or a unix socket on disk somewhere). You'll also find some cutting edge tools like dive[0] have some support for podman[1] but it's not necessarily the usual case.
podman v3 has a docker-compose compatible socket. From https://news.ycombinator.com/item?id=26107022 :
> "Using Podman and Docker Compose" https://podman.io/blogs/2021/01/11/podman-compose.html
Ask HN: What security is in place for bank-to-bank EFT?
When I set up an EFT on a bank's website, all I need to enter is the other bank's routing number and account number, which can be readily found on a paper check. Then you can transfer money from one bank to another... What security and authentication is in place to prevent fraud? In case of fraud, is the victim guaranteed to get the money back?
AFAIU, no existing banking transaction systems require the receiver to confirm in order to receive a funds transfer.
You can create a "multisig" DLT smart contract that requires multiple parties' signatures before the [optionally escrowed] funds are actually transferred.
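The control flow such a multisig contract enforces can be sketched without any cryptography; "approvals" here are stand-ins for real signatures, and the class is purely illustrative, not a real smart-contract API:

```python
# Toy 2-of-3 "multisig" escrow: funds release only once enough of the
# named parties have approved. No cryptography here; approvals are
# stand-ins for signatures, showing only the control flow.
class MultisigEscrow:
    def __init__(self, parties, threshold, amount):
        self.parties = set(parties)
        self.threshold = threshold
        self.amount = amount
        self.approvals = set()
        self.released = False

    def approve(self, party):
        """Record one party's approval; release at the threshold."""
        if party not in self.parties:
            raise ValueError("unknown party")
        self.approvals.add(party)
        if len(self.approvals) >= self.threshold:
            self.released = True  # the funds transfer would happen here
        return self.released

escrow = MultisigEscrow({"sender", "receiver", "arbiter"}, 2, 100)
print(escrow.approve("sender"))    # False: still held in escrow
print(escrow.approve("receiver"))  # True: threshold met, released
```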
EFT: Electronic Funds Transfer: https://en.wikipedia.org/wiki/Electronic_funds_transfer
As far as permissions to write to the account ledger: Check signatures are scanned. Cryptoasset keys are very long, high-entropy "passwords". US debit cards are chip+pin; it's not enough to just copy down the card number (and CVV code).
Though credit cards typically are covered by fraud protection, debit card transactions typically aren't: hopefully something will be recovered, but AFAIU debit txs might as well be as irreversible as cryptoasset transactions.
TPL: Transaction Permission Layer is one proposed system for permissions in blockchain; so that e.g. {proof of residence, receiver confirmation, accredited investor status, etc.} can be necessary for a transaction to go through.
ILP: Interledger Protocol > RFC 32 > "Peering, Clearing and Settling" describes how ~EFT with Interledger works: https://interledger.org/rfcs/0032-peering-clearing-settlemen...
Podman: A Daemonless Container Engine
Is the title of this page out of date?
AFAIU, Podman v3 has a docker-compose compatible socket and there's a daemon; so "Daemonless Container Engine" is no longer accurate.
"Using Podman and Docker Compose" https://podman.io/blogs/2021/01/11/podman-compose.html
Podman does not run in a daemon-mode by default. There's a dbus-activated service unit that activates the daemon when you try to use Docker Compose with it.
It is almost 100% compatible with docker. I only had to do minimal changes in my Dockerfiles to use it.
But not Buildkit and docker-compose.
Podman v3 is compatible with docker-compose (but not yet swarm mode, FWIU), has a socket and a daemon that services it.
Buildah (`podman buildx`, `buildah bud --arch arm64`) just gained multiarch build support; so also building arm64 containers from the same Dockerfile is easy now. https://github.com/containers/buildah/issues/1590
IDK what BuildKit features should be added to Buildah, too?
> IDK what BuildKit features should be added to Buildah, too?
Cache mounts are what we rely on for incremental rebuilds.
Cambridge Bitcoin Electricity Consumption Index
Ok, I have to admit that I am one of those people who knew about BTC very, very early but never bought any. Now, of course, I should have mined at least 100 back in the day, but anyway.
The reason I have never mined nor bought into it is that I still, to this day, struggle to come up with a real reason to do so.
Am I getting too old? Do I not see the "huge potential" what it may become? Don't I want to get freaking rich? I simply can't answer what BTCs and other crypto coins are actually good for. Can someone help me out here?
I'm in your boat. It was first explained to me in a hacker space at college, when it was first growing and priced at a few cents per 100+ Bitcoin. All I could see was a waste of energy and resources (time, energy, hardware).
It will always be a drain on society, and it scares me that it hasn't failed because we see it draining our GPU industry to the point that market prices have gone insane and availability is practically non-existent / all left up to chance.
It doesn't take a rocket scientist to see the trends: as long as it's tied to needing hardware to compute arbitrary calculations to work, it's going to waste computer resources and energy. Both of which are finite resources... Yes, finite: solar cells aren't free to make, and they don't last forever after you make them. Same with any other green technology. Maybe if fusion were our only power source it would be less of a concern, but we still haven't cracked Q=1, so... Yes, it is a waste of finite resources.
It’s not draining the GPU industry. No one buys GPUs for mining Bitcoin anymore, they use ASICs.
Even as just ASICs there's something of a drain, Bitcoin ASICs consume fab production which means less remaining production capacity for GPUs.
This is an equilibrium that should eventually fix itself, but while Bitcoin and crypto mining is growing they are making a material dent in the global semiconductor economy.
Cryptoasset mining creates demand for custom chip fab (how different are mining rigs from SSL/TLS accelerator expansion cards?), which is definitely not zero-sum: more revenue = more opportunities.
https://en.wikipedia.org/wiki/Price_elasticity_of_supply
With insufficient demand, a market does not develop into a sustainable market. "Rule of three (economics)" says that markets are stable with 3 major competitors and many smaller competitors; nonlinearity and game theory.
https://en.wikipedia.org/wiki/Rule_of_three_(economics)
We've always had custom chip fab, but the prices used to be much higher. Proof of Work (and Proof of Research) incentivize microchip and software energy efficiency; whereas we had observed and been most concerned with doublings in transistor density.
FWIU, it's now more sustainable and profitable to mine rare earth elements from recycled electronics than to actually dig real value out of the earth?
Compared to creating real value by digging for gold, how do we value financial services?
Bitcoin's fundamental value is negative given its environmental impact
If the price of energy is calculated with corresponding carbon tax included, shouldn't Bitcoin be neutral?
> If the price of energy is calculated with corresponding carbon tax included, shouldn't Bitcoin be neutral?
Yes, but there are vastly more energy-efficient substitute DLTs with near-zero switching costs. Litecoin and scrypt (instead of SHA-256), for example.
Apply a USD/kWhr threshold across all industries.
Is this change (and focus on the external costs of energy production) more the result of penalties or incentives?
Pre-mined coins are vastly more energy efficient (with tx costs <1¢ and similarly minimal kWhr/tx costs), but the market doesn't trust undefined escrow terms that are fair game in commodities and retail markets.
We have trouble otherwise storing energy from noon to commute and dinner time; whereas a commodity like grain may keep for quite a while.
Bitcoin serves as a demand subsidy when heavily-subsidized energy prices crash due to oversupply (which we should recognize as temporary, because we are moving to electric vehicles and need to reach production volumes at which renewables are more cost-effective than the alternatives).
In the US, we have neither carbon taxes nor intraday prices. The EU has carbon taxes and electrical energy markets.
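A back-of-envelope version of the carbon-tax question above; every number here is a hypothetical placeholder, not a measured figure:

```python
# Back-of-envelope: effective cost per transaction if a carbon tax is
# priced into electricity. All numbers below are hypothetical
# placeholders, chosen only to show the arithmetic.
kwh_per_tx = 700.0          # hypothetical energy per on-chain tx
price_per_kwh = 0.05        # USD, hypothetical electricity price
carbon_tax_per_kwh = 0.02   # USD, hypothetical carbon tax

cost_without_tax = kwh_per_tx * price_per_kwh
cost_with_tax = kwh_per_tx * (price_per_kwh + carbon_tax_per_kwh)
print(round(cost_without_tax, 2), round(cost_with_tax, 2))  # 35.0 49.0
```

With the tax included, the externality is at least partially internalized; the open question in the thread is whether that alone makes the usage "neutral".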
>In the US, we have neither carbon taxes nor intraday prices.
Like, no intraday electricity market?
Ask HN: What are some books where the reader learns by building projects?
2021 Edition. This is a continuation of the previous two threads which can be found here:
https://news.ycombinator.com/item?id=22299180
https://news.ycombinator.com/item?id=13660086
Other resources:
https://github.com/danistefanovic/build-your-own-x
https://github.com/AlgoryL/Projects-from-Scratch
https://github.com/tuvtran/project-based-learning
"Agile Web Development with Rails [6]" (2020) teaches TDD and agile in conjunction with a DRY, CoC, RAD web application framework: https://g.co/kgs/GNqnWV
Is it wrong to demand features in open-source projects?
Yes, it's wrong to demand something for nothing: that's entitlement, not business (which involves some sort of equitable exchange of goods and/or services).
Better questions: How do I file a BUG report issue, create a feature ENHancement request issue, send a pull request with a typo fix, write DOCs and send a PR, write test cases for a bug report?
How can I sponsor development of a feature?
A project may define a `.github/FUNDING.yml`, which GitHub will display on the 'Sponsor' tab of the GitHub project. A project may also or instead include funding information in their /README.md.
How do I ask on IRC, a mailing list, or the issue tracker how much it would cost and how long it would take to develop a feature, if somebody had some international stablecoin and a limited-term agreement?
The answer may be something like, "thanks for the detailed use case or user story, that's on our roadmap, there are major issues blocking similar features and that's where the expense would be."
CompilerGym: A toolkit for reinforcement learning for compiler optimization
What a great idea. Would per-instruction/opcode costs make this easier to optimize?
Are MuZero or the OpenAI baselines useful for RL compiler optimization?
Turning desalination waste into a useful resource
Tangentially related: there are millions of abandoned oil wells emitting greenhouse gases like methane across the US. Vice had a great video about this last month, but it didn't get much attention here on HN. [1] [2]
Evcxr: A Rust REPL and Jupyter Kernel
I have been using evcxr for the last 2-3 weeks and I'm loving it. There are two things that bother me: (a) fancy UI components seem to only work with the Python and C++ kernels (not theoretically, but with the available packages); (b) while you can redefine values and functions when you're iterating, you cannot do so with structs (yes, big structs go into a local package that I depend on, but sometimes I have ad-hoc utility structs).
Here's the xeus-cling (Jupyter C++ Kernel) source: https://github.com/jupyter-xeus/xeus-cling/tree/master/src
Do any of the other non-Python Jupyter kernels have examples of working fancy UI components? https://github.com/jupyter/jupyter/wiki/Jupyter-kernels
Jupyter kernels implement the Jupyter kernel message spec. Introspection, Completion: https://jupyter-client.readthedocs.io/en/latest/messaging.ht...
Debugging (w/ DAP: Debug Adapter Protocol) https://jupyter-client.readthedocs.io/en/latest/messaging.ht...
A `display_data` Jupyter kernel message includes a `data` key with a dict value: "The data dict contains key/value pairs, where the keys are MIME types and the values are the raw data of the representation in that format." https://jupyter-client.readthedocs.io/en/latest/messaging.ht...
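Concretely, the content of a `display_data` message is just a MIME-type-to-representation bundle plus optional metadata; a sketch following the messaging spec (transport framing, headers, and signing omitted; the values are made up):

```python
# Sketch of the content a kernel sends in a `display_data` message:
# a MIME-type -> representation "data" dict plus optional "metadata".
# Field names follow the Jupyter messaging spec; the representation
# values themselves are placeholders.
display_data_content = {
    "data": {
        "text/plain": "<Figure: 1 axes>",
        "text/html": "<b>rendered fallback</b>",
        "image/png": "...",  # base64-encoded bytes in a real message
    },
    "metadata": {"image/png": {"width": 640, "height": 480}},
}

# A frontend then picks the richest MIME type it supports:
preferred = ["image/png", "text/html", "text/plain"]
shown = next(m for m in preferred if m in display_data_content["data"])
print(shown)  # image/png
```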
This looks like it does something with MIME bundles: https://github.com/jupyter-xeus/xeus-cling/blob/00b1fa69d17b...
ipython.display: https://github.com/ipython/ipython/blob/master/IPython/displ...
ipython.core.display: https://github.com/ipython/ipython/blob/master/IPython/core/...
ipython.lib.display: https://github.com/ipython/ipython/blob/master/IPython/lib/d...
You can also run Jupyter kernels in a shell with jupyter/jupyter_console:
pip install jupyter-console jupyter-client
jupyter kernelspec list
jupyter console --kernel python3
Ask HN: What is the cost to launch a SaaS business MVP
In interviews with entrepreneurs, I noticed that several spent a great deal of money on developing their product before launch. What is your experience?
When you're not yet paying yourself, your costs are your living costs and opportunity costs (in addition to the given fixed and variable dev and prod deployment cloud costs).
Early feedback from actual customers on an MVP can save lots of development time. GitLab Service Desk is one way to handle emails as issues from users who don't have GitLab accounts.
A beta invite program / mailing list signup page costs very little to set up; you can start building your funnel while you're developing the product.
Cryptocurrency crime is way ahead of regulators and law enforcement
I used to be surprised when I would read articles about this mythical legislative process which can and will protect lowly plebeians like me. I'm not sure why so many people are under the impression legislation will be able to keep up with perversely motivated individuals or groups. Who are these legislators with motives so pure that they will never be bribed and will only look out for my interests? Let's never forget there is only one person in jail for the 2008 crash, which this article is trying to convince us was the result of too little regulation. There was never enough incentive to protect us from the people who created that market crash.
I think the problem with 2008 was that no one was doing anything illegal, just a lot of poor judgement and lack of regulations.
Bitcoin was created by people who were so peeved at the regulators' interventions in the '08 crisis that they decided that bypassing the entire financial system was the best option. They forever immortalised "The Times 03/Jan/2009 Chancellor on brink of second bailout for banks" as a reminder of why they were doing what they did [0].
We don't need better regulation. We need, when financial firms muck up on a global scale, to replace the people who run those firms with different people. There is an easy way to do that, let them go broke.
The biggest problem in the room is that the regulators have bought in to the idea that some companies are "too big to fail". That is a stupid idea, fundamentally unfair, completely throwing out the best part of capitalism which is that recognised idiots aren't allowed to be in charge. That problem will certainly not be solved by giving the regulators more power.
Bitcoin was created in context to "Transparency and Accountability": a campaign motto not coincidentally found in the title of the "Federal Funding Accountability and Transparency Act of 2006".
> The Federal Funding Accountability and Transparency Act of 2006 (S. 2590)[2] is an Act of Congress that requires the full disclosure to the public of all entities or organizations receiving federal funds beginning in fiscal year (FY) 2007. The website USAspending.gov opened in December 2007 as a result of the act
Sen. Obama's office originated this bill, which was fronted by Sens. Coburn and McCain, who had the clout.
https://usaspending.gov/ creates a mandatory database with budgetary line item metadata. Where money actually goes is something that is far more transparent and accountable with bitcoin and other public ledgers than any existing ledger covered by bank secrecy laws.
For context, in 2008-09, global financial systems were failing as a result of the American economy. The housing bubble burst. There was an HFT "flash crash" whose cause we didn't have CAT or big-data tools to determine. DDoS attacks and cybersecurity losses were increasing year over year. Credit default swaps had been rated as AAA securities: bundled bad debt was sold like it was worth something, and the losses were then written down. Enron's energy speculation had played out amidst rolling blackouts that left hospitals in the dark, on gas generators. Government investment in renewables had been paltry since the Carter administration put solar panels on the roof of the White House before the whole oil price shock, and oil-commodity speculation had driven the price of oil to roughly 2-4x the 2000 price (with resultant price effects on most CPI inflation/PPP basket goods). But electricity consumption was down in 2008, and renewables hadn't reached the production volumes necessary to hit the competitive price point they now present: cheaper than nonrenewables.
Who would have thought that the speculative price would continue to exceed the production cost? Incentives or penalties?
Externalities per dollar returned per kWh is one way to assess the total costs of electricity production methods.
"Buy Gold" was the refrain of the day: TV commercials, signs out in front of piano stores (a somewhat arbitrary commodity, sales of which are observed to be a leading indicator of economic health), signs on the road. And the message was "take your money out of the market and put it in gold," which drives up the prices of chips and boards and medical equipment that rely upon that commodity as a material input. Gold is necessary for tropical-spec components in high-humidity environments: gold hinges are prized, for example.
But, "look, there's water flowing from the chocolate fountain; so you can go ahead and go" and "you know you want to put it back in there, in that market" were the appropriate messages given our revenue at the time.
For further technology scene context in 2007-2009,
"Grid computing" links to a number of distributed computing projects: https://en.wikipedia.org/wiki/Grid_computing#History
IIRC, there was a production metric-priced grid system developed around Seattle/Vancouver called "Gold" (?) that was built on Xen and is likely a precursor to metric-priced cloud services like EC2 and S3 (which now simplify calculating how much a 51% attack against a Proof-of-Work txpool with adaptive tx fees costs with n good participants in the game), and which incentivize efficiency by penalizing expensive operations.
Code bloat was already a thing: how is everything getting slower when Moore's Law predicts the growth rate in transistor density? Are there sufficient incentives for code efficiency when there seem to be surplus compute resources just idly depreciating?
MySQL primary/secondary replication was considered a viable distributed database system, but securing replication depends upon cert exchange and, optionally, PKI, DNS, and IP tunnels of some sort. And then: who has root, write access to the journal and tables and indexes on the filesystem, and UPDATE and DELETE access in an inter-organizational distributed-systems architecture? One with XML, Web Services, our very own ESB to scale separately from the database replication, off-site backups that nobody ever checks against the online data, and fragmented, varyingly-implemented industry standards that hopefully specify at least a key for each record that's unique across ledgers/systems/databases.
BitTorrent DHT magnet: links were extant.
Linden Dollars in Linden Lab's Second Life (there's a price floor on land, which is necessary to sell digital assets/goods/products/services) and accumulated avatar value in e.g. EverQuest and WarCraft (for which there were secondhand markets).
ACH was ACH: GPG-signed files over SFTP on the honor of the audited bank to not allow transfers that deposit money that doesn't exist.
There was no common struct for banking APIs (as apparently only e.g. Plaid, Quicken, and Mint solve for): ledger transactions have a fixed width text field that may contain multiple fields concatenated into one string, and there's no "payee URI" column in the QIF or CSV dumps of an account ledger.
To request more than e.g. the past 90 days of one's own checking account ledger, one was expected to parse tables out of per-month PDFs with e.g. PDFMiner at $20 apiece, and then think up one's own natural key in order to merge and look up records, because (2008-01-01,3.99) and (2008-01-01,3.99,storename) are indistinct as a natural key (and when hashed). If you loan your bank money (for them to now freely invest in the other side, since GLBA in 1999), wouldn't you think that the least they could do is give you `SELECT * WHERE account_id=?` as a free CSV, without any datetime limitation regarding what's offline and what's online?
"Audit the Fed" and "Audit DoD" were being chanted by economically-aware citizens amidst a severe correction and what was then the most severe recession since the Great Depression: the "Great Recession," it was called, and payouts to well-connected cronies (who hadn't saved wheat for the famine) were deemed essential.
Overdraft was an error charged to the customer, who didn't build an inconsistent system (CAP theorem) that allows spending money that doesn't exist (at interest charged to the consumer/taxpayer).
"Catch Me If You Can" (2002) described the controls for bank fraud at the time. Why are fees so high?
"Office Space" (1999) described the penny-shaving / salami-slicing attack: "fractions of a penny".
"Beverly Hills Ninja" (1997) detailed the story of the Great White Ninja and Tanley! (fistpalm)
"Swordfish" (2001) described a domestic disaster and bank transfers confirmable in seconds.
Ask HN: Why aren't micropayments a thing?
Amazon aws and related services can charge you a rate per email, or per unit time of computation, so why can't news sites just charge you $0.01 to read an article, or even half that?
For reference, there's Web Monetization [1] which tries to solve exactly that.
As others have noted it all boils down to user agent support. Otherwise most publishers probably won't consider giving up ad or subscription financed models.
Also, forcing users into subscriptions allows for better demographics data/statistics.
https://webmonetization.org/ lists Coil (flat $5/mo) as the first Web Monetization provider: https://coil.com/
Web Monetization builds upon ILP (Interledger Protocol), which is designed to work with any type of ledger; though it's probably not possible for any traditional ledger to beat the <1¢ transaction fee that only pre-mined coins have been able to achieve.
I really like their approach and I paid for Flattr for quite a while when it started. But something big needs to push them to the critical adoption rate. On a number of very techy blog posts, I got... 0 from WebMonetization. And that's on best case audience (coming from HN and tech Reddit).
Elon Musk announces $100M carbon capture prize
https://www.xprize.org/prizes/carbon :
> The $20M NRG COSIA Carbon XPRIZE develops breakthrough technologies to convert CO₂ emissions into usable products.
CCS: https://en.wikipedia.org/wiki/Carbon_capture_and_storage
CCU: https://en.wikipedia.org/wiki/Carbon_capture_and_utilization
Sequestration: https://en.wikipedia.org/wiki/Carbon_sequestration
Tim Berners-Lee wants to put people in control of their personal data
Interesting in a technological sense, but what problem it solves isn't obvious to me. It lets me granularly authorize first-party access to the data I have in my pod, but there can't be any technical guarantees with regard to illegitimate sharing or otherwise copying (many parties might at least cache, for instance), nor about what is collected and shared outside this system.
I keep seeing data-hubs and identity-providers touting themselves as solutions to the web's privacy issues, but I don't see how they actually solve anything.
It seems like an attempted technical solution to a social problem to me.
The real problem with data based services (ads, Google search, etc) is really that a bunch of data is collected opaquely, unethically, and in some cases illegally. The whole system including data brokers and real time bidding is out of control.
In my (personal) view, it's the technical part of a solution that definitely also needs to have a social/legislative component. It cannot prevent parties from illegitimate sharing of my data, but it does give them the option to hand over control to me. There are lots of companies that currently hold data on us but for whom that data is not their primary competency, and they only need a small nudge (like GDPR) to make having the customer responsible for that data an attractive proposition.
You might be right, but I think it's disingenuous to market it as though this "solves privacy". Worst case, people are lulled into a false sense of security.
Data-storage + authorization doesn't solve any (new) technical privacy-issues; this is "data protection" rather than "data privacy" in my book.
While I recognize the value of W3C LDP and SOLID, I also fail to see anything in SOLID that prevents B from sharing A's now pod-siloed information.
Does it prevent screenshots and OCR?
So it's in standard record structs and that makes it harder for the bad guys?
Who moderates mean memes with my face on them?
It is my hope that future Linked Data spec tutorials model something benign like shapes or cells instead of people: so that we can still see the value.
Laws still exist against things like perjury, even though the existence of the law is not a technical means in itself able to prevent perjury. Note that one of the comments upthread specifically mentioned legislation. The current notion that many people in the tech world have, which roughly states that what determines whether something is kosher is whether it's technically possible to accomplish, is something that needs to change, instead of things just staying a permanent Wild West forever.
There's also an old phrase that putting locks on your doors doesn't actually stop a determined attacker, but that it's okay because they're not meant to—that they're meant to "keep honest people honest". It's a principle that applies here.
No, there are few to no actual privacy improvements over centralized systems.
Perhaps even functional regression: what, are you going to run a hash blocklist across all nodes? Like Spamhaus? Is there logging or user accounting? Is anything admissible under chain of custody, or are we actually talking about privacy and liberty here?
Is everything just marked "not for unlimited distribution"? And we depend upon there not being bad actors?
Real costs are very different with just friendly early adopters.
Cryptographically signing posts (with LD-Signatures) may help with integrity, but that can be done with centralized systems and does nothing to help with confidentiality.
What about availability? Is it a trivially-DOS'able system?
Governments spurred the rise of solar power
I can't read the article, but I think evidence like this is important when it comes to the whole idea of government subsidies impacting free markets. The free market has become a bit of a religion for some (you can see it in the comments here), and there's a staunch divide between those who believe the free market will always come up with a humanitarian solution and those who believe the government needs to push things for them to happen.
My FIL was head of a large coal plant in the UK, and is very critical of renewable energies, their cost, the impact on the fossil fuel industry etc... It simply isn't possible to have a conversation with him about global warming - he will simultaneously argue it isn't happening and it's natural and not caused by us. He might be an extreme example, but there is general scepticism of government subsidies here.
I'd love to see a study where researchers take things that are seen as good/important/essential to modern life and measure the amount of public/government sponsorship that helped bring it about.
I always find the free-market-vs-government arguments a false dichotomy. Even the freest markets require a whole host of government services: regulation, standards-setting, contract enforcement, antifraud measures, and provision of infrastructure, to name but a few. People only cry "free market" when they don't like a particular government activity.
That coal plant relies on infrastructure to get its coal. It needs fair utility prices to deliver its power to customers. It needs reliable (abuse free) markets to sell the power on. It needs voltage and frequency/time standards to access the network. It needs contract enforcement to actually get paid by its customers. It needs insurance and educated workers and a CEO who won't embezzle.
All provided by government, all welcomed by industry. But suddenly government has no place in free markets when it comes to CO2.
Should we prefer penalties or incentives in order to use predictable markets for the change we need?
That depends on whether you want power to be cheaper or more expensive. Cheaper is nice, but you'll pay for it in taxes etc. Expensive is painful at the point of consumption but encourages efficiency. It's a strategic and political decision and depends on a nation's needs and objectives...
> I'd love to see a study where researchers take things that are seen as good/important/essential to modern life and measure the amount of public/government sponsorship that helped bring it about.
Essential technology investment of US tax dollars?
NASA spinoff technologies: https://en.wikipedia.org/wiki/NASA_spinoff_technologies
NSF, DARPA, IARPA, In-Q-Tel, ARPA-e (2009)
List of emerging technologies: https://en.wikipedia.org/wiki/List_of_emerging_technologies
Termux no longer updated on Google Play
The GitHub discussion is significantly more informative and carries a lot of thinking behind the changes: https://github.com/termux/termux-app/issues/1072
IMO a better link than a short paragraph on Wiki.
Note that there are 297 hidden items in that issue so you have to click "Load more..." ceil(297/60) times to read all of the comments about how APK packaging is soon necessary for latest Android devices so the termux package manager can't just dump executable binaries wherever.
FWIU:
- Android Q+ disallows exec() on anything in $HOME, which is where termux installed binaries that may have been writeable by the executing user.
- Binaries installed from APKs can be exec()'d, so termux must keep APK repacks rebuilt and uploaded to a play store.
- Termux shouldn't be installed from Google Play anymore: you should install termux from the F-Droid APK package repos, and it will install APKs instead of what it was doing.
- Compiling to WASM with e.g. emscripten or WASI was one considered alternative. "Emscripten & WASI & POSIX" https://github.com/emscripten-core/emscripten/issues/9479
What about development on-the-device?
- It seems C compiled with clang on the device wouldn't be executable? (If it was, that would be a way around the restriction: distribute packages as source, like the good old days)
> offer users the option of generating an apk wrapping their native code in a usable way. https://github.com/termux/termux-app/issues/1072#issuecommen...
This seems a promising solution: compile from source, create an apk, install - your custom distribution! For popular collections of packages, a pre-built apk.
- Java might be explicitly blocked, being a system language for Android, even though its bytecode is interpreted and not exec()'d.
- Other interpreted languages should be OK e.g. python
> > offer users the option of generating an apk wrapping their native code in a usable way.
> This seems a promising solution: compile from source, create an apk, install - your custom distribution! For popular collections of packages, a pre-built apk.
FPM could probably generate APKs in addition to the source archive and package types that it already supports.
The conda-forge package CI flow is excellent. There's a bot that sends a Pull Request to update the version number in the conda package feedstock meta.yml when it detects that e.g. there's a new version of the source package on e.g. PyPI. When a PR is merged, conda-forge builds on Win/Mac/Lin and publishes the package to the conda-forge package channel (`conda install -y -c conda-forge jupyterlab pandas`)
The Fedora GitOps package workflow is great too, but bugzilla isn't Markdown by default.
Keeping those APKs updated and rebuilt is work.
Ask HN: What should go in an Excel-to-Python equivalent of a couch-to-5k?
Yesterday, my co-founder published a blog about her experiences Ditching Excel for Python in her job as a Reinsurance Analyst [0].
One of the responses on reddit [1] asked what they should do, "Step 1 day 1," if having read Amy's post they were convinced to try and begin the long journey from tangled Excel/Access spaghetti.
My (flippant) reaction to a friend that brought the comment to my attention was unhelpful; "Step 1 day 1, quit." So he has challenged me to write eight helpful blog posts during the remainder of my Garden Leave.
What should go in them?
[0] https://amypeniston.com/ditching-excel-for-python/
How to write functions in JS / VB script and call them from a cell expression.
How to name variables something other than AB3.
How to use physical units and datatypes. (How to specify XSD datatype URIs that map to native primitives in an additional frozen header row. e.g. py-moneyed and Pint & NumPy ufuncs)
How topological sort works (is there a tsort to determine what to calculate first (and whether there are cycles) on every modification event?)
Which Jupyter platforms do and don't support interactive charts with e.g. ipywidgets?
pandas.df.plot(kind=) (matplotlib), seaborn (what are the calculated parameters of this chart?), holoviews, plotly, altair
Reproducibility w/ repo2docker / BinderHub:
pip freeze > constraints.txt
cp constraints.txt requirements.txt
conda env export --from-history
Also,
When is it better to have code in a notebook instead of in a module?
How to export notebook cells to a module with nbdev
How to write tests to assert the quality of the code and the model: @pytest.mark.parametrize, pytest-notebook, jupyter-pytest-2, pytest-jupyter
When is it appropriate to parametrize a notebook with e.g. papermill?
How to handle concurrency: dask.distributed + dask-labextension, ipyparallel
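On the topological-sort question above, here's a minimal sketch with the stdlib's graphlib (Python 3.9+); the cell names and formulas are illustrative:

```python
# Spreadsheet recalculation order is a topological sort of the cell
# dependency graph: each cell maps to the set of cells its formula reads.
from graphlib import TopologicalSorter, CycleError

deps = {
    "B1": {"A1"},        # e.g. B1 = A1 * 2
    "C1": {"A1", "B1"},  # e.g. C1 = A1 + B1
}
order = list(TopologicalSorter(deps).static_order())
# A1 is computed first, then B1, then C1.

# A circular reference is detected rather than recalculated forever:
cycle_detected = False
try:
    list(TopologicalSorter({"A1": {"B1"}, "B1": {"A1"}}).static_order())
except CycleError:
    cycle_detected = True
```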
Interesting, your suggestion is that the first thing to do is to learn how to make functions and variable names within excel?
Yeah, if you port it to functions and verify that you haven't broken anything, you could then easily port to Python functions that you could call from Excel with an add-on. But everyone who opens a sheet that calls Python must have that same add-on (and all Python package dependencies) installed, too.
>My (flippant) reaction to a friend that brought the comment to my attention was unhelpful; "Step 1 day 1, quit." So he has challenged me to write eight helpful blog posts during the remainder of my Garden Leave.
One way to do it would be to do impedance matching: maximizing usefulness by bringing Python to the universe the person who asked the question lives in: Excel, Word, PDF, email.
This means "Automate the Boring Stuff"[0] to ease into programming with something specific to do that is relevant and useful right off the bat for manipulating files, spreadsheets, PDF documents, etc.
Openpyxl[1] for playing more with Excel files, and basically searching "Excel Python" on the internet for more use cases.
Python-docx[2] for playing with Word documents, extracting content from tables, etc.
PdfPlumber[3] for playing with PDFs. Sometimes you have data in .pptx files, and you can use LibreOffice to convert all of them to PDFs because you hate the pptx format:
libreoffice --headless --invisible --convert-to pdf *.pptx
Also, if you both are using notebooks, take a look at what we're building at https://iko.ai. I posted about it a bit earlier[4]. It has no-setup real-time Jupyter notebooks, and all the jazz. It focuses on solving machine learning problems, since we built it for ourselves to slash projects' time.
- [0]: https://nostarch.com/automatestuff2
- [1]: https://openpyxl.readthedocs.io/en/stable/
- [2]: https://python-docx.readthedocs.io/en/latest/
Using openpyxl to extract data from existing Excel workbooks and other documents would seem like a good place to start.
I suspect many companies use excel workbooks as "forms" with lots of data at the same cell in multiple workbooks.
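A sketch of harvesting that kind of "form" data into one table (requires openpyxl; the folder layout and the cell address B2 are assumptions for illustration):

```python
# Pull the same cell out of every workbook in a folder of "forms".
from pathlib import Path
from openpyxl import load_workbook

def collect_cell(folder, cell="B2"):
    """Map each workbook's filename to the value at `cell`."""
    values = {}
    for path in sorted(Path(folder).glob("*.xlsx")):
        wb = load_workbook(path, read_only=True, data_only=True)
        values[path.name] = wb.active[cell].value
        wb.close()
    return values
```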
> I suspect many companies use excel workbooks as "forms" with lots of data at the same cell in multiple workbooks.
Downstream data-quality costs can be minimized with a normalized data schema and data collection process controls like forms-based data validation.
There are established UI/UX design patterns for data validation of user-supplied data: accessible [web] forms with tab-ordered input fields and specific per-input feedback with accessible HTML5 and ARIA. IIUC, Firefox now supports PDF forms, too?
Why would we move from a spreadsheet to an actual database?
Data integrity:
Referential integrity (making sure that record keys actually point to something when creating, updating, or deleting),
Columnar datatypes (float, decimal, USD, complex fraction),
Access controls (auth(z): authentication and authorization),
Auditing (what was the value before and after that) and Disaster Recovery,
Organizationally-unified schema development and corresponding validation.
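To make the referential-integrity point concrete, a minimal sketch with the stdlib's sqlite3 (table and column names are illustrative):

```python
# A database refuses a dangling key that a spreadsheet silently accepts.
import sqlite3

con = sqlite3.connect(":memory:")
con.execute("PRAGMA foreign_keys = ON")  # off by default in SQLite
con.execute("CREATE TABLE customer (id INTEGER PRIMARY KEY)")
con.execute("""
    CREATE TABLE invoice (
        id INTEGER PRIMARY KEY,
        customer_id INTEGER NOT NULL REFERENCES customer(id),
        amount NUMERIC NOT NULL CHECK (amount >= 0)
    )""")
con.execute("INSERT INTO customer (id) VALUES (1)")
con.execute("INSERT INTO invoice VALUES (1, 1, 9.99)")  # valid key: accepted

rejected = False
try:
    con.execute("INSERT INTO invoice VALUES (2, 999, 5)")  # dangling key
except sqlite3.IntegrityError:
    rejected = True
```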
Repeatability / Reproducibility: can you replay the steps needed to build the whole sheet? What parameters were entered, and how do we script that part so that we can easily assess the relations between the terms of the argument presented?
Scientists turn CO2 into jet fuel
Link to the article in question: https://www.nature.com/articles/s41467-020-20214-z
They call it 'renewable' yet use iron as a catalyst. I'm not a chemist - how is that renewable?
Yao, B., Xiao, T., Makgae, O.A. et al. "Transforming carbon dioxide into jet fuel using an organic combustion-synthesized Fe-Mn-K catalyst." Nat Commun 11, 6395 (2020). https://doi.org/10.1038/s41467-020-20214-z
Synthetic biology can do it better. Fix CO2 by growing sugar cane, turn sugar into jet fuel with genetically engineered yeast.
https://www.total.com/media/news/press-releases/total-and-am...
Still not commercially viable, but much closer to it than the linked process.
That approach requires lots of land, water, and time. Nothing wrong with that, but much of the funding for CO2->jet fuel research comes from the military, which wants to be able to create liquid fuel quickly from CO2 in the atmosphere and they don't care how inefficient it is. Military customers are typically not very interested in processes that require running a farm.
The military also spends a lot of time turning jet fuel into electricity, because the former is energy-dense, easy to store, easy to transport. And they need electricity in lots of places without a grid.
Where would the reverse be useful? Somehow you have unlimited electricity but no fuel, and you have time and space to run a chemical refinery? (And isn't carbon capture from the atmosphere likely to require farm-scale infrastructure anyway?)
You cannot run aircraft on electricity; liquid fuel is the only option. The military is keenly interested in being able to refuel aircraft in situations where liquid fuel shipments might be denied or unavailable for some reason.
You can run aircraft on electricity.
Locomotives run on electrical energy produced by diesel generators (because electric motors are more energy efficient), for example.
The limits are the cost and weight of the batteries and the charge time.
You can't in any practical sense. Locomotives don't have a problem carrying the weight of a diesel engine plus a generator; in fact the extra weight works in their favor.
If you tried that trick with an airplane it would never get off the ground. Locomotives use this system not for efficiency but because they need the low-end torque electric motors provide. (If they only needed efficiency they'd just drive the wheels with the diesel engine directly.) Airplanes don't need torque; they need power.
You can run an airplane on batteries but only for a few minutes with modern battery technology. Practical battery-powered aircraft for military applications are not going to happen any time soon. We might see battery-powered air taxis soon but those are not even close to meeting military requirements for fighters, transports, etc.
From https://en.wikipedia.org/wiki/Electric_aircraft :
> [For] large passenger aircraft, an improvement of the energy density by a factor 20 compared to li-ion batteries would be required
The time it takes to surpass this energy-density threshold is affected by battery tech investments, which had been comparatively paltry next to defense spending. Trillions on batteries would have been a much better investment, with ROI.
Sadly, some folks in defense still can't understand why non-oil investments in battery tech are best for all.
There are multiple electric trainer aircraft with flight times over an hour and quite a few more in development.
Jet engines are terribly inefficient (30-50% efficient) compared to electric motors.
Show HN: Stork: A customizable, WASM-powered full-text search plugin for the web
OP - congratulations on shipping! WASM-powered search caught my eye.
Looks like the search index is downloaded and used locally in the browser, so this is as fast as search can get. One trade-off though is that you're limited to relatively small datasets. While this shouldn't be an issue for small-medium static sites, an index that needs to be downloaded over the wire will affect your page performance for larger sites / datasets.
In those cases, you'd want to use a dedicated search server like Algolia or its open source alternative Typesense [1]. Both of them store the index server-side and only return the relevant records that are searched for.
For example, you'd probably not want to download 2M records over the wire to search through them locally [2]. You'd be better off storing the index server-side.
[1] https://github.com/typesense/typesense (shameless plug)
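A toy inverted index makes the tradeoff concrete: with client-side search, the whole `index` structure ships to the browser, so it grows with the corpus (the documents here are invented):

```python
# Build a tiny inverted index: each term maps to the set of documents
# containing it. A real engine would also tokenize, stem, and rank.
docs = {
    1: "wasm powered full text search",
    2: "customizable search plugin for the web",
}
index = {}
for doc_id, text in docs.items():
    for term in text.split():
        index.setdefault(term, set()).add(doc_id)

def search(term):
    """Return the sorted doc ids that contain `term`."""
    return sorted(index.get(term, ()))
```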
Just a quick non-thought-through idea but would it be possible to build an index in a way that allows clients to download only parts of it based on what they search? I.e. the search client normalizes the query in some way and then requests only the relevant index pages. The index would probably be a lot larger on the server but if disk space is not an issue...?
(Though at some point you have to ask yourself what the benefits of such an approach are compared to a search server.)
Merkle Search Trees: Efficient State-Based CRDTs in Open Networks https://hal.inria.fr/hal-02303490/document
Peer-to-Peer Ordered Search Indexes https://0fps.net/2020/12/19/peer-to-peer-ordered-search-inde... (which adds useful context about the above)
Sorry for just dropping these links, I should already be asleep :)
> Merkle Search Trees: Efficient State-Based CRDTs in Open Networks https://hal.inria.fr/hal-02303490/document
https://scholar.google.com/scholar?cites=7160577141569533185... ... "Merkle Hash Grids Instead of Merkle Trees" (2020) https://scholar.google.com/scholar?cluster=13503894708682701...
Browser-side "Blockchain Certificate Transparency" applications need to support at least exact key lookup by domain/SAN and then also by cert fingerprint value; but the whole CT chain with every cert issue and revocation event is impractically large in terms of disk space.
https://github.com/amark/gun#history may also be practically useful.
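For a sense of why Merkle structures help here: a client can verify a single leaf (e.g. one certificate) against a root with an O(log n) proof path instead of storing the whole log. A minimal root computation with stdlib hashlib (the odd-node padding scheme is a simplification, not the CT/RFC 6962 construction):

```python
# Compute a Merkle root by repeatedly hashing pairs of nodes.
import hashlib

def h(data: bytes) -> bytes:
    return hashlib.sha256(data).digest()

def merkle_root(leaves):
    level = [h(leaf) for leaf in leaves]
    while len(level) > 1:
        if len(level) % 2:
            level.append(level[-1])  # duplicate the odd node out
        level = [h(level[i] + level[i + 1])
                 for i in range(0, len(level), 2)]
    return level[0]
```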
Upptime – GitHub-powered open-source uptime monitor and status page
Maker here, I made Upptime as a way to scratch my own itch... I wanted a nice status page for Koj (https://status.koj.co) and was previously using Uptime Robot which didn't allow much customization to the website.
It started with the idea of using GitHub issues for incident reports and using the GitHub API to populate the status page. The obvious next step was opening the issues automatically, so the uptime monitor was born. I love using GitHub Actions in interesting ways (https://github.blog/2020-08-13-github-actions-karuna/) and have made it a large part of my toolkit!
https://news.ycombinator.com/item?id=25557032 mentions "~3000 minutes per month". GitLab's new pricing structure: [(runner_minutes, usd_per_month), (400, $0), (2_000, $4), (10_000, $19), (50_000, $99)]
You can run a self-hosted GitHub or GitLab Runner with your own resources: https://docs.github.com/en/free-pro-team@latest/actions/host...
GitLab [Runner] also runs tasks on cron schedules.
The process invocation overhead for CI is greater than for a typical metrics collection process like a nagios check or a memory-resident daemon like collectd with the curl plugin and the "Write HTTP" plugin (if you're not into using a space and time efficient timeseries database for metrics storage)
An open source project with a $5/mo VPS could run collectd in a container with a config file far far more energy efficiently than this approach.
Collectd curl statistics: https://collectd.org/documentation/manpages/collectd.conf.5....
Collect list of plugins: https://collectd.org/wiki/index.php/Table_of_Plugins
Is there a good way to do {DNS, curl HTTP, curl JSON} stats with Prometheus (instead of e.g. collectd as a minimal approach)?
Show HN: Simple-graph – a graph database in SQLite
I wonder if there are ways, in SQLite, to build indices for s,p,o/s,p/p,o/ and maybe more subtle ones... That would be uber nice, given the fact that most graph databases have their own indexing strategies, and you cannot craft your own.
rdflib-sqlalchemy is a SQLAlchemy rdflib graph store backend: https://github.com/RDFLib/rdflib-sqlalchemy
It also persists namespace mappings so that e.g. schema:Thing expands to http://schema.org/Thing
The table schema and indices are defined in rdflib_sqlalchemy/tables.py: https://github.com/RDFLib/rdflib-sqlalchemy/blob/develop/rdf...
You can execute SPARQL queries against SQL, but most native triplestores will have a better query plan and/or better performance.
Apache Rya, for example:
> indexes SPO, POS, and OSP.
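As a minimal illustration of what those three orderings look like in SQLite (a toy schema, not rdflib-sqlalchemy's or Rya's actual one):

```python
import sqlite3

conn = sqlite3.connect(":memory:")
conn.execute("CREATE TABLE triples (s TEXT, p TEXT, o TEXT)")
# One covering index per access pattern:
conn.execute("CREATE INDEX idx_spo ON triples (s, p, o)")  # known subject
conn.execute("CREATE INDEX idx_pos ON triples (p, o, s)")  # known predicate
conn.execute("CREATE INDEX idx_osp ON triples (o, s, p)")  # known object

conn.executemany("INSERT INTO triples VALUES (?, ?, ?)", [
    ("ex:alice", "rdf:type", "schema:Person"),
    ("ex:alice", "schema:knows", "ex:bob"),
    ("ex:bob", "rdf:type", "schema:Person"),
])

# "Who does alice know?" can be answered entirely from idx_spo:
rows = conn.execute("SELECT o FROM triples WHERE s = ? AND p = ?",
                    ("ex:alice", "schema:knows")).fetchall()
```

You can confirm the index chosen with EXPLAIN QUERY PLAN; whether SQLite's planner keeps up with a native triplestore's is, as noted above, another matter.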
Thanks for your comment. I use rdflib frequently but have never tried the SQLAlchemy back end. Now I will. That said, Jena or Fuseki, or the commercial RDF stores like GraphDB, Stardog, and Allegrograph are so much more efficient.
In CPython, types implemented in C are part of the type tree
The docs should have coverage on this:
Python/C API Reference Manual: https://docs.python.org/3/c-api/index.html
Python/C API Reference Manual » Object Implementation Support > Type Objects: https://docs.python.org/3/c-api/typeobj.html
CPython Devguide > Exploring Python Internals > Additional References: https://devguide.python.org/exploring/
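You can also observe this from Python itself: C-implemented types participate in the same type tree rooted at object.

```python
# int, bool and len() are all implemented in C in CPython, yet they sit
# in the ordinary type hierarchy:
assert int.__mro__ == (int, object)
assert bool.__mro__ == (bool, int, object)  # bool subclasses the C-level int
assert type(len).__name__ == "builtin_function_or_method"
assert isinstance(len, object)              # C functions are objects too
assert type(int) is type                    # with `type` as their metaclass
```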
Experiments on a $50 DIY air purifier that takes 30s to assemble
From "Better Box Fan Air Purifier" https://tombuildsstuff.blogspot.com/2013/06/better-box-fan-a... :
> Air purifiers can be expensive and you've probably seen articles recommending to just put a 20" x 20" x 1" furnace filter on a cheap 20" box fan and POOF! instant cleaner air for not a lot of money. It really does clean the air pretty cheap.
> There's a problem with this though. These fans weren't designed to be run with a filter. The filter will restrict air flow which will put a higher strain on the motor causing it to use more electricity and in worse cases could be a fire hazard. The higher the MERV rating (cleaning efficiency) of the filter the more stress it will put on the fan.
> Don't worry! You can still have your cheap air purifier as long as the filter area is increased to decrease the effect of air resistance. Instead of using one 20x20x1 filter we'll use two 20x25x1 filters which increases the filter surface area over 250%. It's a little more expensive because you're using two filters instead of one but the increased filter surface area also helps the filter last longer before it gets clogged up and we're saving on energy use compared to a single filter.
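The quoted 250% figure is straightforward to check:

```python
# One 20"x20" filter vs. two 20"x25" filters, face area in square inches:
single = 20 * 20          # 400 sq in
double = 2 * (20 * 25)    # 1000 sq in
ratio = double / single   # 2.5, i.e. 250% of the original filter area
```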
> There's a problem with this though. These fans weren't designed to be run with a filter. The filter will restrict air flow which will put a higher strain on the motor causing it to use more electricity and in worse cases could be a fire hazard. The higher the MERV rating (cleaning efficiency) of the filter the more stress it will put on the fan.
I'm fairly certain the opposite is true. In fact, most rowing machines work on this principle. The easiest setting is the one where the fan is as closed off as possible because it's pulling a vacuum. Less air, less resistance, easier to row / less power required to spin the fan. If anything, fans should draw less current with air flow on the inlet side restricted.
A. Putting two [larger] filters in a 'V' with cardboard to fill the top and bottom pulls the same amount of air through a larger area of filters
B. pulling the same volume of air through greater surface area results in greater pressure between the filter and fan than one filter directly affixed to the fan
C. The lower air pressure / "suction" due to an obstructed intake causes an electric fan motor to fail more quickly.
D. Increasing the air pressure that the motor is in reduces the failure rate?
Goodreads plans to retire API access, disables existing API keys
This makes absolutely no sense and has no relation to any economic variables. Goodreads isn't some struggling self-funded startup; it's owned by Amazon.com. The acquisition should never have been approved, and wouldn't have been if the Obama administration had been anything but completely impotent at protecting us from monopoly games:
https://www.theguardian.com/books/2013/apr/02/amazon-purchas...
I would like to understand the true strategic interest behind this. Is Amazon simply penny-pinching now that they’ve successfully obliterated the market for both new and used books online? There’s way more to this story than appears on the surface.
It's not mentioned in the article, but Amazon had disabled its affiliate link program for Goodreads ahead of this announcement, which cut off a major source of their revenue and forced them to sell. They had no choice.
The strategic reason for Amazon is obvious. As someone else mentioned, Amazon doesn't want Goodreads data to be used to add value to their competitors' offerings.
Speaking as a developer who tried to build on top of Goodreads API, I also want to add that this was a long time coming. The API had been neglected for some time. And some of the most interesting datasets weren't even made available through the API.
Apparently Amazon’s Kindle lost the ability to share progress on Goodreads in the last week as well.
I guess the writing's on the wall...
Turing Tumble Simulator
(Turing Tumble is a (fun) marble-powered computer game.)
Python Pip 20.3 Released with new resolver
If anyone hasn't seen it, now is a good time to look at https://python-poetry.org/ It is rapidly becoming _the_ package manager to use. I've used it in a bunch of personal and professional projects with zero issues. It's been rock solid so far, and I'm definitely a massive fan.
I occasionally ask our principal / sr. Python engineers about this, and their response is always, "These things come and go, virtualenv/wrappers + pip + requirements.txt works fine - no need to look at anything else."
We've got about 15 repos, with the largest repo containing about 1575 files and 34MBytes of .py source, 14 current developers (with about 40 over the last 10 years) - and they really are quite proficient, but haven't demonstrated any interest at looking at anything outside pip/virtualenv.
Is there a reason to look at poetry if you've got the pip/virtualenv combination working fine?
People who use poetry seem to love it - so I'm interested in whether it provides any new abilities / flexibility that pip doesn't.
beware of zealot geeks bearing gifts. if your environment is currently working fine and you are only interested in running one version of python and perhaps experimenting with a later one then venv + pip is all you need, with some wrapper scripts as you say to make it ergonomic (to set the PYTHON* environment variables for your project, for example)
The main reason I use Poetry is for its lockfile support. I've written about why lockfiles are good here: https://myers.io/2019/01/13/what-is-the-purpose-of-a-lock-fi...
Pip supports constraints files. https://pip.pypa.io/en/stable/user_guide/#constraints-files :
> Constraints files are requirements files that only control which version of a requirement is installed, not whether it is installed or not. Their syntax and contents is nearly identical to Requirements Files. There is one key difference: Including a package in a constraints file does not trigger installation of the package.
> Use a constraints file like so:
    python -m pip install -c constraints.txt

What does that command do? Does it install dependencies, or simply verify the versions of the installed dependencies?
It just gives you an error:
ERROR: You must give at least one requirement to install (see "pip help install")
So it seems like a strange choice of usage example. You have to provide both requirements and constraints for it to do anything useful (applying the version constraints to the requirements and their dependencies).

Your package manager should be boring, extremely backward and forward compatible, and never broken. Experience has shown this not to be true for Python. Several times over the years I've found myself pinning, upgrading, downgrading, or otherwise juggling versions of setuptools and pip in order to work around some bug. Historically I have had far more problems with the machinery for installing Python packages than I have had with all of the other Python packages being installed combined, and that is absurd.
Convolution Is Fancy Multiplication
FWIW, (bounded) Conway's Game of Life can be efficiently implemented as a convolution of the board state: https://gist.github.com/mikelane/89c580b7764f04cf73b32bf4e94...
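The linked gist uses a convolution routine directly; here is a numpy-only sketch of the same idea, where the neighbor count is the convolution of the board with a 3x3 all-ones kernel (minus the center), implemented with wrap-around shifts:

```python
import numpy as np

def life_step(board):
    # Neighbor count = sum of the 8 shifted copies of the board, i.e. a
    # 3x3 convolution with periodic (toroidal) boundary conditions.
    neighbors = sum(np.roll(np.roll(board, i, axis=0), j, axis=1)
                    for i in (-1, 0, 1) for j in (-1, 0, 1)
                    if (i, j) != (0, 0))
    # Standard rules: born with 3 neighbors, survive with 2 or 3.
    return ((neighbors == 3) | ((board == 1) & (neighbors == 2))).astype(np.uint8)

# A blinker oscillates between horizontal and vertical with period 2.
blinker = np.zeros((5, 5), dtype=np.uint8)
blinker[2, 1:4] = 1
```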
How to better ventilate your home
I recently purchased a home and spent some time improving ventilation in it this year. This is a home in Minneapolis, so just "opening a window" is not an option, as it would be too cold in the winter and dramatically reduce energy efficiency. Its effectiveness also changes based on factors like the wind, and the temperature difference between inside and out.
The general strategy for doing this right now with existing homes is to insulate your home as well as possible for energy efficiency, then install a continuous ventilation fan. This is essentially a bathroom fan, except that it runs all the time at a constant low speed, helping to circulate the air through and out your house by "pulling" a designed amount of air through the cracks.
I didn't want to punch a new hole in the house just to do this, and already had a bathroom fan installed, so as a hack I just turned it into a whole house ventilation fan with this: https://www.aircycler.com/pages/smartexhaust
Basically you calculate how many CFM of continuous ventilation you need based on the square footage of your house, and then you set it on the fan control. It still acts like a bathroom fan, except every hour it also runs for a set period of time (in my case, about 12 minutes).
The standard for this is ASHRAE 62.2. Use this to calculate the CFM for code: https://homes.lbl.gov/ventilate-right/step-3-whole-building-...
Then the formula for calculating the fan run time is on this sheet: https://cdn.shopify.com/s/files/1/0221/7316/files/AC_DOC_7_0...
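As a rough sketch of that arithmetic (the coefficients below are the whole-building formula from ASHRAE 62.2-2013 and later; the run-time calculation is a simplification that ignores the standard's intermittent-use effectiveness factor, so use the linked sheet for the real number):

```python
def ashrae_622_cfm(floor_area_sqft, bedrooms):
    # ASHRAE 62.2 (2013+) whole-building rate:
    # 0.03 cfm per sq ft of floor area + 7.5 cfm per (bedrooms + 1)
    return 0.03 * floor_area_sqft + 7.5 * (bedrooms + 1)

def minutes_per_hour(required_cfm, fan_cfm):
    # Run a bigger intermittent fan just long enough each hour to move
    # the same air volume as a continuous fan at required_cfm would.
    return 60 * required_cfm / fan_cfm

needed = ashrae_622_cfm(1500, 3)         # 75 cfm for a 1500 sq ft, 3 BR home
runtime = minutes_per_hour(needed, 110)  # ~41 min/hr with a 110 cfm bath fan
```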
And presto, you have a ventilation system without having to do a lot of work. Do _not_ try this with a crappy, rusty old bathroom fan - clean or replace the motor first, and for extra credit, use an arc fault circuit interrupter on the breaker so if the motor fails it will blow the fuse instead of potentially causing a fire.
Note: This is really just to manage general air quality and VOCs. If you want to specifically make ventilation for COVID-19, that's a different problem. They focus on Air Changes per Hour (ACH), and a cubic feet calculation is used rather than square feet. There's no "recommended amount" of ACH for managing COVID-19. You're likely improving the situation by increasing it, but I wouldn't start inviting people over after you did it. ACH is very high in ICUs but staff are still getting sick there.
RE Humidity - ideal range varies based on region and outside temperature, but this chart roughly shows it: https://lh3.googleusercontent.com/proxy/mPz-jGdLpnPGgvWR3Egf...
I have an on-furnace humidifier controlled by an ecobee for the winter, it's a huge quality of life improvement if you live in cold climates, but make sure to set it to "frost control" otherwise it won't lower the humidity based on the outside air and you can get mold in your walls. For summer, a standard house A/C combined with continuous ventilation should be sufficient to bring down humidity levels.
Finally, the "correct" air ventilation is a moving target with trade-offs and concerns of the moment. It was higher in 1925 (30cfm/person) to try to prevent tuberculosis and infectious diseases, then was lowered to 5cfm/person in the 70s during the energy crisis, and is currently at 15cfm/person. I imagine COVID-19 could make us re-consider the current recommendations. https://homes.lbl.gov/ventilate-right/ashrae-standard-622
I'm in an older home that leaks like a sieve but has central air. Would turning on that house fan to circulate without running the AC accomplish the same thing? Also, do any smart thermostats allow this sort of thing to be automated?
"Fan control with a Nest thermostat" https://support.google.com/googlenest/answer/9296419?hl=en
Looks like there could be: (1) an every hour for n minutes schedule; (2) an option to run the fan with the thermostat off; (3) an option to shut off the fan when everyone is gone
Quantum-computing pioneer Peter Shor warns of complacency over Internet security
If an organization has a 5-year refresh cycle (roughly the time to implement a new IT system), and a quantum computer with a sufficient number of error-corrected qubits exists by 2027 [1], then an organization/industry has the 5 years from 2022 to go quantum-resistant: replace existing solutions with quantum-resistant algorithms (and, in some cases, a DLT with a coherent pan-industry API) and/or double their RSA and ECDSA key sizes.
[1] "Quantum attacks on Bitcoin, and how to protect against them (ECDSA, SHA256)" https://news.ycombinator.com/item?id=15907523
Which DLT/blockchains without PKI (or DNS) will implement the algorithms selected from the NIST Post-Quantum Cryptography (PQC) round 3 candidate algorithms? https://csrc.nist.gov/projects/post-quantum-cryptography
CERN Online introductory lectures on quantum computing from 6 November
This is probably a dumb question but are there any data sciencey or tensorflowy things that can be done faster on a quantum computer?
There are some basic linear algebra subroutines (Matrix inversion, finding eigenvalues & eigenvectors) that can be performed with an exponential speedup on a quantum computer in theory, that's why there is so much interest in Quantum Machine Learning. If you are asking about the current hardware level, then no, current quantum computers can not solve any practical problem faster than a classical computer.
In theory, classical computers are also not precluded from finding better solutions to problems where quantum computers claim superiority, like prime factorisation; see e.g. Tang's quantum-inspired classical algorithm that beats HHL for low-rank matrices.
> There are some basic linear algebra subroutines (Matrix inversion, finding eigenvalues & eigenvectors) that can be performed with an exponential speedup on a quantum computer in theory
Eh... those particular subroutines have polynomial time algorithms already on a classical computer.
You can't exponentially speed up something that's polynomial time already.
https://quantumalgorithmzoo.org/ lists algorithms, speedups, and descriptions.
That's a great list, thanks!
(Though for the benefit of readers here, the list doesn't include any "basic linear algebra subroutines (Matrix inversion, finding eigenvalues & eigenvectors)").
A Manim Code Template
The demo video looks cool. It's maybe not obvious that there's a link to the code-video-generator (which is built on manim by 3blue1brown) demo video in the README: https://youtu.be/Jn7ZJ-OAM1g
Startup Financial Modeling: What is a Financial Model? (2016)
https://www.causal.app/ has free business model templates: SaaS (Foresight), eCommerce (https://foresight.is/), Startup Runway, Buy/Rent, Ads Calculator
The article explicitly, repeatedly says templates are a bad idea. If you disagree, it'd be interesting to see your reasoning, rather than links to free templates.
We'd do better to find a list of business modeling books and tools.
And then take a look at integrating actual data sources; hopefully some quantitative with APIs.
Uncertainties supports mean±"error" w/ "error propagation": https://pypi.org/project/uncertainties/
Sliders etc can be done in Jupyter notebooks with e.g. ipywidgets: https://ipywidgets.readthedocs.io/en/latest/
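For context, the core of what `uncertainties` automates is first-order error propagation; a hand-rolled sketch for independent errors (hypothetical numbers):

```python
import math

def add(a, da, b, db):
    # Independent 1-sigma errors add in quadrature under addition.
    return a + b, math.hypot(da, db)

def mul(a, da, b, db):
    # Under multiplication, *relative* errors add in quadrature.
    v = a * b
    return v, abs(v) * math.hypot(da / a, db / b)

# e.g. revenue = units_sold * price, each with an uncertainty:
revenue, d_rev = mul(1000, 100, 9.99, 0.50)
```

The package itself also tracks correlations between variables, which this sketch does not.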
At what grade level do presidential candidates debate?
Intelligence does not imply superior moral, ethical, or rational judgement.
Simplicity of speech does not imply lack of intelligence.
Here's the section on Simple English in Simple English Wikipedia: https://simple.wikipedia.org/wiki/Wikipedia:About#Simple_Eng...
Imagine being reprimanded for use of complex words and statistical terms in an evidence-based policy discussion in a boardroom. Imagine someone applying to be CEO, President, or Chairman of the Board and showing up without a laptop, any charts, or any data.
Topicality!
Perhaps there is a better game for assessing competency to practice evidence-based policy.
This commenter effectively refutes the claim that Fleisch-Kincaid is a useful metric for assessing the grade-level of interpretively-punctuated spoken language: https://news.ycombinator.com/item?id=24807610
Like I said, from "Ask HN: Recommendations for online essay grading systems?" https://news.ycombinator.com/item?id=22921064 :
> Who else remembers using the Flesch-Kincaid Grade Level metric in Word to evaluate school essays? https://en.wikipedia.org/wiki/Flesch%E2%80%93Kincaid_readabi...
> Imagine my surprise when I learned that this metric is not one that was created for authors to maximize: reading ease for the widest audience is not an objective in some departments, but a requirement.
> What metrics do and should online essay grading systems present? As continuous feedback to authors, or as final judgement?
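For reference, the grade-level metric being discussed is a simple formula over word, sentence, and syllable counts:

```python
def flesch_kincaid_grade(words, sentences, syllables):
    # Flesch-Kincaid Grade Level, per the Wikipedia article above:
    # 0.39 * (words/sentences) + 11.8 * (syllables/words) - 15.59
    return 0.39 * words / sentences + 11.8 * syllables / words - 15.59

# 10-word sentences averaging 1.3 syllables per word score around grade 3.65:
grade = flesch_kincaid_grade(words=100, sentences=10, syllables=130)
```

The hard part in practice is the syllable counter, not the formula, and (as the linked comment argues) sentence segmentation of loosely punctuated spoken transcripts.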
That being said, disrespectful little b will not be tolerated or venerated by the other half of the curve.
ElectricityMap – Live CO₂ emissions of electricity production and consumption
What's going on in Queensland, Australia? "793g Carbon intensity" seems insanely high, especially as compared with Norway/France/Sweden (at 24-41g carbon intensity).
Does anyone have insight into this? I've never pictured Australia as a big contributor to pollution.
Good observation. I was curious, and it looks like the parser code to retrieve the data is open source; so here's the relevant module for Australia:
https://github.com/tmrowco/electricitymap-contrib/blob/24ea2...
The production mix (energy generation by source) source URL is listed on L326.
Opening the dataset in LibreOffice Calc, filtering by Region = QLD1, filtering for Current Output > 0, and sorting by Current Output descending does seem to show that the vast bulk of production reported has fuel source descriptor 'Black Coal'.
If that's a bug in the reported data or source, hopefully the ElectricityMap team can jump in to take a look. The data initially seems legit, from searching for some of the power stations listed.
Edit / addendum: worth searching for Australia in the project's GitHub issues too; there's some existing (although not directly related) investigation & validation work ongoing for that module
Also - home solar penetration in QLD is the highest in Australia. Is that not factored into the equation? It doesn't look that way, given the 0% renewables shown.
This website is pulling information from public sources, so it can only show public generation & consumption information.
If someone is running a grid-tied solar system, outbound energy might be accounted for in these statistics, but it won't show any immediate consumption of that solar generation by the property, nor will it show any battery storage, it's only capable of showing what goes through the energy meter.
+1. California has substantial rooftop solar (11 GW peak) behind the meter, which isn’t shown in these graphs.
https://pv-magazine-usa.com/2019/04/15/californias-solar-pow...
How would behind the meter electricity consumption change the reported amount of CO2 emitted by other electricity production sources?
The map is intended to show the emissions intensity of electricity so if you miss a bunch of electricity from distributed generation you will over-estimate the intensity.
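Concretely, the reported figure is a production-weighted average, so leaving unmetered generation out of the denominator inflates it (illustrative numbers; the emission factors are approximate lifecycle medians):

```python
def carbon_intensity(mix_mw, g_per_kwh):
    # Grid intensity = production-weighted mean of per-source intensities.
    total = sum(mix_mw.values())
    return sum(mix_mw[s] * g_per_kwh[s] for s in mix_mw) / total

factors = {"coal": 820, "gas": 490, "solar": 45}   # gCO2eq/kWh, approximate
metered = {"coal": 800, "gas": 150, "solar": 50}   # MW, hypothetical region
with_rooftop = dict(metered, solar=metered["solar"] + 200)  # add unmetered PV

overestimate = carbon_intensity(metered, factors)   # ~732 g/kWh
better = carbon_intensity(with_rooftop, factors)    # ~617 g/kWh
```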
Man that was a dumb question. Thanks for clarifying.
So we'd need electric utility companies to share the live data of how many kwh of solar and wind people are selling back to the grid in order to get an accurate regional comparison of real-time carbon intensity?
FWIU, they're already parsing the EIA data; but it's significantly more delayed than the max 2 hour delay specified by ElectricityMap.
Here's the parser for the current data from EIA: https://github.com/tmrowco/electricitymap-contrib/blob/maste...
Should the EIA (1) source, aggregate, cache, and make more real-time data available; and (2) create a new data item for behind the meter kwh from e.g. residential wind and solar?
(Edit) "Does EIA publish data on peak or hourly electricity generation, demand, and prices?" https://www.eia.gov/tools/faqs/faq.php?id=100&t=3
> Hourly Electric Grid Monitor is a redesigned and enhanced version of the U.S. Electric System Operating Data tool. It incorporates two new data elements: hourly electricity generation by types of energy/fuel source and hourly sub-regional demand for certain balancing authorities in the Lower 48 states.
> [...]
> EIA does not publish hourly electricity price data, but it does publish wholesale electricity market information including daily volumes, high and low prices, and weighted-average prices on a biweekly basis.
AFAIU, retail intraday rates aren't yet really a thing in the US; but some countries in Europe do have intraday rates (which create incentives for the grid scale energy storage necessary for wide-scale rollout of renewables).
(Edit) "Introduction to the World of Electricity Trading" https://www.investopedia.com/articles/investing/042115/under... :
> Energy prices are influenced by a variety of factors that affect the supply and demand equilibrium. On the demand side, commonly referred to as a load, the main factors are economic activity, weather, and general efficiency of consumption. On the supply side, commonly referred to as generation, fuel prices and availability, construction costs and the fixed costs are the main drivers of the price of energy. There's a number of physical factors between supply and demand that affect the actual clearing price of electricity. Most of these factors are related to the transmission grid, the network of high voltage power lines and substations that ensure the safe and reliable transport of electricity from its generation to its consumption.
Which customers (e.g. data centers, mining firms) would take advantage of retail intraday rates?
How does cost and availability of storage affect the equilibrium price of electricity?
>So we'd need electric utility companies to share the live data of how many kwh of solar and wind people are selling back to the grid in order to get an accurate regional comparison of real-time carbon intensity?
You actually need production numbers for what's used "behind the meter" which by its nature is not directly available live or otherwise and has to be estimated.
>Which customers (e.g. data centers, mining firms) would take advantage of retail intraday rates?
Loads! In the UK you can get half-hourly priced electricity even at a domestic level (with a smart meter), and if you have loads that can be re-scheduled (mainly EV charging, but an electric storage heater would work as well) you can save quite a lot of money.
Heavy industrial users definitely move usage around both to avoid expensive wholesale charges but also to reduce their transmission connection charges which (in the UK) are based on their usage during the most congested periods of the year. Water companies will vary pump operations for this reason and water pumping and treatment alone is 2% of electricity use.
Data centers definitely pay on a half-hourly settled basis but tend not to shift their workloads around to take advantage, some data centers will run their cooling systems in such a way as to reduce usage during the most expensive few half hours though. I have heard that larger users like Amazon, FB, Google, will automatically load balance between global centers to reduce electricity bills and carbon footprint.
> how many kwh of solar and wind people are selling back
Not just selling back. Producing - and then either using or selling.
This USA data seems kind of strange to me. For instance, in my home state, Kansas, we are fourth in the nation for installed wind capacity, supplying 41% of our energy demand. Our sole nuclear power plant provides an 18% base load; 33% is coal and the rest random crap. We're very population-sparse and low-population in general. Yet other areas of the country that are strictly coal and natural gas fired, with greater population density and a higher total population, are less turd colored. I'm guessing this is a strange interaction with renewable availability, or it just might be a lack of available data.
From https://github.com/tmrowco/electricitymap-contrib#data-sourc... :
> Here are some of the ways you can contribute:
> Building a new parser, Fixing a broken parser, Changes to the frontend, Find data sources, Verify data sources, Translating electricitymap.org, Updating region capacities
I sent a few tweets and emails about the data in this region but nothing happened here either
Bash Error Handling
From https://twitter.com/b0rk/status/1312413117436104705 :
> TIL that you can use the "DEBUG" trap to step through a bash script line by line
trap '(read -p "[$BASH_SOURCE:$LINENO] $BASH_COMMAND?")' DEBUG
> [...] it does something very different than sh -x — sh -x will just print out lines, this stops *before* every single line and lets you confirm that you want to run that line

> you can also customize the prompt with set -x
export PS4='+(${BASH_SOURCE}:${LINENO}) '
set -x
With a markdown_escape function, could this make for something like a notebook with ```bash fenced code blocks with syntax highlighting?

A Customer Acquisition Playbook for Consumer Startups
> For consumer companies, there are only three growth “lanes” that comprise the majority of new customer acquisition:
> 1. Performance marketing (e.g. Facebook and Google ads)
> 2. Virality (e.g. word-of-mouth, referrals, invites)
> 3. Content (e.g. SEO, YouTube)
> There are two additional lanes (sales and partnerships) which we won't cover in this post because they are rarely effective in consumer businesses. And there are other tactics to boost customer acquisition (e.g PR, brand marketing), but the lanes outlined above are the only reliable paths for long-term and sustainable business growth.
Marketing calls those "channels". I don't think they're exclusive categories: a startup's YouTube videos could be supporting a viral marketing campaign, for example; ads aren't the only strategy for (targeted) social media marketing; if the "ask" / desired behavior upon receiving the message is to share the brand, is that "viral"?
What about Super Bowl commercials?
Traditional marketing: press releases, (linked-citation-free) news wires, quasi-paid interviews, "news program" appearances, product placement.
"Growth hacking": https://en.wikipedia.org/wiki/Growth_hacking
Gathering all open and sustainable technology projects
Great list! https://github.com/protontypes/awesome-sustainable-technolog...
"Earth" topic label: https://github.com/topics/earth
Though there are unfortunately zero (0) results for "sustainability-report[s,ing]", it may be useful to have such a topic heading: https://github.com/topics/sustainability-reporting ... GRI Reporting Standards & XBRL: https://en.wikipedia.org/wiki/Global_Reporting_Initiative , Sustainability Cloud, https://en.wikipedia.org/wiki/Sustainability_reporting
TIL about the "SWEET" (Semantic Web for Earth and Environmental Terminology) Ontologies: https://github.com/ESIPFed/sweet
SunPy: https://docs.sunpy.org/en/stable/generated/gallery/index.htm...
Jupyter Notebooks Gallery
This is not a valid Show HN, so we've taken that out of the title. Please see the rules: https://news.ycombinator.com/showhn.html. Note the word 'lists'.
Lists tend not to make good HN submissions because the only thing one can really discuss about them is the lowest common denominator (or greatest common factor?) of the list elements. It would be better to pick one of the most interesting elements and submit that instead.
HN is itself a list, and a pointer to a pointer to a pointer is too much indirection.
https://hn.algolia.com/?query=denominator%20list%20by:dang&d...
Jupyter/Jupyter > Wiki > "A gallery of interesting Jupyter Notebooks" lists hundreds of notebooks: https://github.com/jupyter/jupyter/wiki/A-gallery-of-interes...
The mybinder.org Grafana dashboard lists the most popular notebook repos in the last hour: https://grafana.mybinder.org/
Jupyter/Nbviewer > FAQ > "How do you choose the notebooks featured on the nbviewer.jupyter.org homepage?" :
> We originally selected notebooks that we found and liked. We are currently soliciting links to refresh the home page using a Google Form. You may also open an issue with your suggestion.
https://nbviewer.jupyter.org/faq#how-do-you-choose-the-noteb...
Google Form: https://docs.google.com/forms/d/e/1FAIpQLSd6AlVvC7KagENypGTc...
Here's the Nbviewer source code. AMP (And https://schema.org/ScholarlyArticle / Book / CreativeWork metadata) could be useful. https://github.com/jupyter/nbviewer
Jupyter notebooks are almost a huge step forward for programming. The instant feedback, saving of partial program state, and visualizations mixed in are amazing and fun to work with. But the fact that it takes place in a browser, without modal editing with Vim, without plugin support (like MyPy), and without IntelliSense, is a huge step backwards.
I'm hoping that someone addresses these issues, I know VSCode is trying, and Scala "Metals" worksheets are another similar concept (though different, it puts variable values in comments next to code as you update the code).
I feel like all of these are a step closer than what the Lighttable/Eve teams were shooting for, but not quite the next generation way of coding that it could be.
I'm a huge fan of rmarkdown since it addresses many of these shortcomings by way of being much simpler than jupyter notebooks. The files are simply text files, and you can write python too if you like, they aren't just for R.
NestedText, a nice alternative to JSON, YAML, TOML
So, this is basically YAML, "but better". I can repeat once more that "easily understood and used by both programmers and non-programmers" is an unapologetically stupid concept that can never succeed. So I see how all of this will sound all too familiar to anybody with a little experience, which makes them automatically dismiss this YAYAML.
But YAML is really quite complicated, and JSON (which shouldn't be used for config files at all) and TOML (which I love and wish it would gain more popularity) aren't exactly alternatives to YAML. So, I would be actually totally ok with "YAML, but better", as a way to deprecate YAML.
Now, it is clear from the start that this cannot deprecate YAML, because it doesn't even have booleans and numbers. But, surprisingly, I can accept this as well: ok, let's just assume that being good at dealing with strings may be enough.
The problem is, it isn't clear at all from the docs, if this is better than YAML at anything. It raises dozens of questions. I'll start with the most basic ones (using [] as a wrapper/delimiter): how do I represent values [ a], [a ], ["a"] and [""] in this file format?
What kind of values are [<whitespace>a] and [a<whitespace>] supposed to be? They look like typical YAML syntax traps to me.
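For contrast, a quick Python sketch of how a format with mandatory quoting answers those four cases; JSON makes leading whitespace, trailing whitespace, the bare string, and the empty string all representable and visibly distinct:

```python
import json

# JSON always quotes strings, so " a", "a ", "a", and "" are four
# unambiguous, distinct values; bare-scalar formats must re-invent
# rules (or traps) to distinguish them.
for value in [" a", "a ", "a", ""]:
    print(json.dumps(value))
```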
Why should JSON never be used for configuration? It is sufficient for declaratively expressing anything I have encountered. Do we really need references or other stuff from YAML? For configuration this seems unnecessary, provided that the program that interprets the result of parsing the JSON is well written.
JSON lacks comments and will fail for a missing or extra comma, so it's not great for configuration written by humans.
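A minimal demonstration of that strictness (the key name here is illustrative):

```python
import json

# JSON parses strictly: a trailing (or missing) comma is a hard parse
# error, and there is no comment syntax for humans editing a config by hand.
try:
    json.loads('{"debug": true,}')  # trailing comma
except json.JSONDecodeError as err:
    print("parse error:", err)
```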
You can use HJSON which is the json with comments. It's fully compatible with json so easy to introduce into anything that does json. https://hjson.github.io/
JSON5 also supports comments and multiline strings with `\`-escaped newlines: https://json5.org/
Triple-quoted multiline strings like HJSON would be great, too.
From "The description of YAML in the README is inaccurate" https://github.com/KenKundert/nestedtext/issues/10 :
> I will mention something else. The section about the "Norway problem" is not quite accurate. Some YAML loaders do in fact load no as false. These are usually YAML 1.1 loaders. YAML 1.2's default schema is the same as JSON's (only true, false, null, and numbers are non-strings).
> Any YAML loader is free to use any schema it wants. That is, no loader is required to load no as false. Good loaders should support multiple schemas and custom schemas. The Norway problem isn't technically a YAML problem but a schema problem.
> imho, YAML's biggest failing to date is not making things like this clear enough to the community.
> Note: PyYAML has a BaseLoader schema that loads all scalar values as strings.
Algorithm discovers how six molecules could evolve into life’s building blocks
It'd be nice if they could do the same for protein synthesis, but that's obviously a much harder problem.
Folding@home https://en.wikipedia.org/wiki/Folding@home :
> Folding@home (FAH or F@h) is a distributed computing project aimed to help scientists develop new therapeutics to a variety of diseases by the means of simulating protein dynamics. This includes the process of protein folding and the movements of proteins, and is reliant on the simulations run on the volunteers' personal computers.
"AlphaFold: Using AI for scientific discovery" (2020) https://deepmind.com/blog/article/AlphaFold-Using-AI-for-sci...
https://www.kdnuggets.com/2019/07/deepmind-protein-folding-u... :
> At last year’s Critical Assessment of protein Structure Prediction competition (CASP13), researchers from DeepMind made headlines by taking the top position in the free modeling category by a considerable margin, essentially doubling the rate of progress in CASP predictions of recent competitions. This is impressive, and a surprising result in the same vein as if a molecular biology lab with no previous involvement in deep learning were to solidly trounce experienced practitioners at modern machine learning benchmarks.
Citations of "Resource-efficient quantum algorithm for protein folding" (2019) https://scholar.google.com/scholar?cites=1037213034434902738...
Protein folding: https://en.wikipedia.org/wiki/Protein_folding
I always wonder how many open research problems could benefit from a working hand from computing. Sometimes I wish research would be as available as an open-source project.
As a computer scientist who worked in a physics lab: a lot. Other sciences could desperately use the help of talented computer scientists. Unfortunately those other sciences are mostly mired in academia nonsense and don't pay well.
"Applied CS"
Computational science: https://en.wikipedia.org/wiki/Computational_science
Computational biology: https://en.wikipedia.org/wiki/Computational_biology
Computational thinking: https://en.wikipedia.org/wiki/Computational_thinking :
> The characteristics that define computational thinking are decomposition, pattern recognition / data representation, generalization/abstraction, and algorithms.
Additional skills useful for STEM fields: system administration / DevOps / DevSecOps, HPC: High Performance Computing (distributed systems, distributed algorithms, performance optimization; rewriting code that is designed to test unknown things with tests and for performance), research a graph of linked resources and reproducibly publish in LaTeX and/or computational notebooks such as Jupyter notebooks, dask-labextension, open source tool development (& sustainable funding) that lasts beyond one grant
Physicists build circuit that generates clean, limitless power from graphene
"Fluctuation-induced current from freestanding graphene" (2020) https://journals.aps.org/pre/abstract/10.1103/PhysRevE.102.0... https://doi.org/10.1103/PhysRevE.102.042101
> In the 1950s, physicist Léon Brillouin published a landmark paper refuting the idea that adding a single diode, a one-way electrical gate, to a circuit is the solution to harvesting energy from Brownian motion. Knowing this, Thibado's group built their circuit with two diodes for converting AC into a direct current (DC). With the diodes in opposition allowing the current to flow both ways, they provide separate paths through the circuit, producing a pulsing DC current that performs work on a load resistor.
> Additionally, they discovered that their design increased the amount of power delivered. "We also found that the on-off, switch-like behavior of the diodes actually amplifies the power delivered, rather than reducing it, as previously thought," said Thibado. "The rate of change in resistance provided by the diodes adds an extra factor to the power."
> The team used a relatively new field of physics to prove the diodes increased the circuit's power. "In proving this power enhancement, we drew from the emergent field of stochastic thermodynamics and extended the nearly century-old, celebrated theory of Nyquist," said coauthor Pradeep Kumar, associate professor of physics and coauthor.
> According to Kumar, the graphene and circuit share a symbiotic relationship. Though the thermal environment is performing work on the load resistor, the graphene and circuit are at the same temperature and heat does not flow between the two.
> That's an important distinction, said Thibado, because a temperature difference between the graphene and circuit, in a circuit producing power, would contradict the second law of thermodynamics. "This means that the second law of thermodynamics is not violated, nor is there any need to argue that 'Maxwell's Demon' is separating hot and cold electrons," Thibado said.
I don't understand. It sounds like they're drawing useful work (I think?) from a system that's already at thermodynamic equilibrium. Doesn't that violate the 2nd law of thermodynamics?
I'm not sure that I understand either. From the abstract (which phys.org failed to link to):
> The system reaches thermal equilibrium and the rates of heat, work, and entropy production tend quickly to zero. However, there is power generated by graphene which is equal to the power dissipated by the load resistor.
Looks like the article is also on arXiv: https://arxiv.org/abs/2002.09947
https://scholar.google.com/scholar?oi=bibs&hl=en&cluster=103...
Is it really a closed system at equilibrium?
Mozilla shuts project Iodide: Datascience documents in browsers
I did this! I killed it and I didn't mean to.
Ten (10) days ago, I filed an issue in the iodide project: "Compatibility with 'percent' notebook format" https://github.com/iodide-project/iodide/issues/2942
And then six (6) days ago, I added this comment to that issue: https://github.com/iodide-project/iodide/issues/2942#issueco...
And now it's almost dead, and I didn't mean to kill it.
But I also suggested that it would be great if conda-forge had a WASM build target:
- "Consider moving CPython patches upstream" https://github.com/iodide-project/pyodide/issues/635#issueco...
For students, being able to go to a URL and have a notebook interface with the SciPy stack preinstalled without needing to have an organization manage shell accounts and/or e.g. JupyterHub for every student should be worth the necessary budget allocation. Their local machines have plenty of CPU, storage, and memory for all but big data workloads.
Iodide is/was really cool. Pyiodide (much of the SciPy stack compiled to WASM) is also a great idea.
Jyve with the latest JupyterLab, nbgrader, and configurable cloud storage could also solve this.
Google Colab is an alternative to consider if you need to share Jupyter notebooks: https://colab.research.google.com/notebooks/intro.ipynb.
There are many ways to share reproducible Jupyter notebooks.
Google Colab now supports ipywidgets (js) in notebooks. While you can install additional packages in Colab, additional packages must be installed by each user (e.g. with `! pip install sympy` in an initial input cell) for each new kernel.
repo2docker builds a docker image from software dependency versions specified in e.g. requirements.txt, environment.yml, and/or a postInstall script and then installs a current version of JupyterLab in the container. Zero-to-BinderHub describes how to get BinderHub (which builds and launches containers) running on a hosting provider w/ k8s. awesome-python-in-education/blob/master/README.md#jupyter
Google AI Platform Notebooks is hosted JupyterLab.
awesome-jupyter > Hosted Notebook Solutions lists a number of services: https://github.com/markusschanta/awesome-jupyter#hosted-note...
awesome-python-in-education > Jupyter links to many Jupyter resources like nbgrader and BinderHub but not yet Jyve: https://github.com/quobit/awesome-python-in-education#jupyte...
Ask HN: What are good life skills for people to learn?
My initial thoughts; learn to drive, first aid, a sport, play an instrument, a language, how to manage finances, to speak in front of people.
- "Consumer science (a.k.a. home economics) as a college major" https://news.ycombinator.com/item?id=17894550
In no particular order:
- Food science; Nutrition
- Family planning: https://en.wikipedia.org/wiki/Family_planning
- Personal finance (see the link above for resources)
- How to learn
- How to teach [reading and writing, STEM, respect, compassion]
- Compassion for others' suffering
- How to considerately escape from unhealthy situations
- Coping strategies: https://en.wikipedia.org/wiki/Coping
- Defense mechanisms: https://en.wikipedia.org/wiki/Defence_mechanism
- Prioritization; productivity
- Goal setting; n-year planning; strategic alignment
Life skills: https://en.wikipedia.org/wiki/Life_skills
Khan Academy > Life Skills: https://www.khanacademy.org/college-careers-more
Four Keys Project metrics for DevOps team performance
> […] four key metrics that indicate the performance of a software development team:
> Deployment Frequency - How often an organization successfully releases to production
> Lead Time for Changes - The amount of time it takes a commit to get into production
> Change Failure Rate - The percentage of deployments causing a failure in production
> Time to Restore Service - How long it takes an organization to recover from a failure in production
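A rough sketch of computing the four metrics from a deployment log; the records and field names below are illustrative, not from the Four Keys project itself:

```python
from datetime import datetime, timedelta
from statistics import median

# Hypothetical deployment records: commit time, deploy time, whether the
# deploy caused a production failure, and (if so) when service was restored.
deploys = [
    {"commit": datetime(2020, 9, 1, 9), "deploy": datetime(2020, 9, 1, 17),
     "failed": False, "restored": None},
    {"commit": datetime(2020, 9, 2, 9), "deploy": datetime(2020, 9, 3, 9),
     "failed": True, "restored": datetime(2020, 9, 3, 11)},
    {"commit": datetime(2020, 9, 4, 9), "deploy": datetime(2020, 9, 4, 13),
     "failed": False, "restored": None},
]

# Deployment Frequency: successful releases per day over the observed window
days = (max(d["deploy"] for d in deploys)
        - min(d["deploy"] for d in deploys)).days or 1
deployment_frequency = len(deploys) / days

# Lead Time for Changes: commit -> production
lead_time = median(d["deploy"] - d["commit"] for d in deploys)

# Change Failure Rate: fraction of deploys causing a production failure
failures = [d for d in deploys if d["failed"]]
change_failure_rate = len(failures) / len(deploys)

# Time to Restore Service: failure -> recovery
time_to_restore = median(d["restored"] - d["deploy"] for d in failures)

print(deployment_frequency, lead_time, change_failure_rate, time_to_restore)
```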
Ask HN: Resources to encourage teen on becoming computer engineer?
Howdy HN
A teenager I am close with would like to become a computer engineer. What resources, books, podcasts, camps, or experiences do you recommend to support this teen's endeavor?
"Ask HN: Something like Khan Academy but full curriculum for grade schoolers?" [through undergrads] https://news.ycombinator.com/item?id=23794001
"Ask HN: How to introduce someone to programming concepts during 12-hour drive?" https://news.ycombinator.com/item?id=15454071
"Ask HN: Any detailed explanation of computer science" https://news.ycombinator.com/item?id=15270458 : topologically-sorted? Information Theory and Constructor Theory are probably at the top:
> A bottom-up (topologically sorted) computer science curriculum (a depth-first traversal of a Thing graph) ontology would be a great teaching resource.
> One could start with e.g. "Outline of Computer Science", add concept dependency edges, and then topologically (and alphabetically or chronologically) sort.
> https://en.wikipedia.org/wiki/Outline_of_computer_science
> There are many potential starting points and traversals toward specialization for such a curriculum graph of schema:Things/skos:Concepts with URIs.
> How to handle classical computation as a "collapsed" subset of quantum computation? Maybe Constructor Theory?
> https://en.wikipedia.org/wiki/Constructor_theory
https://westurner.github.io/hnlog/ ... Ctrl-F "interview", "curriculum"
CadQuery: A Python parametric CAD scripting framework based on OCCT
I wish more people would try this over OpenSCAD. I like OpenSCAD, but not its UI, and with CadQuery you use Python! And now I noticed that "CadQuery supports Jupyter notebook out of the box".
The jupyter-cadquery extension renders models with three.js via pythreejs in a sidebar with jupyterlab-sidecar: https://github.com/bernhard-42/jupyter-cadquery#b-using-a-do...
https://github.com/bernhard-42/jupyter-cadquery/blob/master/...
Array Programming with NumPy
Looks like there's a new citation for NumPy in town.
"Citing packages in the SciPy ecosystem" lists the existing citations for SciPy, NumPy, scikits, and other -Py things: https://www.scipy.org/citing.html ( source: https://github.com/scipy/scipy.org/blob/master/www/citing.rs... )
A better way to cite requisite software might involve referencing a https://schema.org/SoftwareApplication record in JSON-LD, RDFa, or Microdata; for example: https://news.ycombinator.com/item?id=24489651
But there's as of yet no way to publish JSON-LD, RDFa, or Microdata Linked Data from LaTeX with Computer Modern.
For some reason this struck me as inappropriate for the outlet. It's a nice piece as an introduction to array programming with numpy, but seemed out of place to me.
If, going forward, 5% of all papers that use NumPy to get their results actually cite this paper, it will be one of Nature's most cited papers every year.
There's an interesting trend of what content gets published in peer-reviewed journals vs. blogs/github/etc. I suspect there is an audience segment that strongly values peer reviewed pieces that are equivalent content wise to introductory material in a variety of formats.
I wonder if github should add a "Review" feature to provide a similar content authoring experience.
It would be nice if citing repositories were easier-- either for generating a reference for my own code or acknowledging when I've used someone else's code in my research.
There's tons of math and physics blogs that contain useful results that the author wanted to make available but didn't manage to incorporate into a paper. I wonder if there'd be any interest in a sort of GitHub for proofs? It could even use git, since (assuming consistency) isn't math just a DAG anyways (and therefore isomorphic to a neural net, as are all things).
You can get a free DOI for and archive a tag of a Git repo with FigShare or Zenodo.
If you have repo2docker REES dependency scripts (requirements.txt, environment.yml, postInstall,) in your repo, a BinderHub like https://mybinder.org can build and cache a container image and launch a (free) instance in a k8s cloud.
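For example, a minimal (hypothetical) environment.yml at the repo root that repo2docker would detect and build from:

```yaml
# environment.yml -- illustrative REES config; package pins are examples only
name: example
channels:
  - conda-forge
dependencies:
  - python=3.8
  - numpy
  - jupyterlab
```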
Journals haven't yet integrated with BinderHub.
Putting the suggested citation and DOI URI/URL in your README and cataloging citations in an e.g. wiki page may increase the crucial frequency of citation.
A Linked Data format for presenting well-formed arguments with #StructuredPremises would help to realize the potential of the web as a graph of resources which may satisfy formal inclusion criteria for #LinkedMetaAnalyses.
The issue is that none of the citation count engines (Google scholar, scopus, Web of Science...) count citations on those DOIs. So for a researcher who needs to somehow demonstrate impact through citation counts, it does not really help unfortunately.
We could reason about sites that index https://schema.org/ScholarlyArticle according to our own and others' observations. Google Scholar, Semantic Scholar, and Meta all index Scholarly Articles: they copy the bibliographic metadata and the abstract for archival and scholarly purposes.
AFAIU, e.g. Zotero and Mendeley do not crawl and index articles or attempt to parse bibliographic citations from the astounding plethora of citation styles [citationstyles, citationstyles_stylerepo] into a citation graph suitable for representative metrics [zenodo_newmetrics].
bitcoin.org/bitcoin.pdf does not have a DOI, does not have an ORCID [orcid], and is not published in any journal but is indexed by e.g. Google Scholar; though there are apparently multiple records referring to a ScholarlyArticle with the same name and author. Something like "Hell's Angels" (1930)? No DOI, no ORCID, no parseable PDF structure: not indexed.
AFAIU, Google Scholar does not yet index ScholarlyArticle (or SoftwareApplication < CreativeWork) bibliographic metadata. GScholar indexes an older set of bibliographic metadata from HTML <meta> tags and also attempts to parse PDFs. [gscholar_inclusion]
Google Scholar is also not (yet?) integrated with Google Dataset Search (which indexes https://schema.org/Dataset metadata).
FigShare DOIs and Zenodo DOIs are DataCite DOIs [figshare_howtocite, zenodo_principles]; which apparently aren't (yet?) all indexed by Google Scholar [rescience_gscholar].
IIUC, all papers uploaded to https://arxiv.org are indexed by Google Scholar. In order for arxiv-vanity.org [arxiv_vanity] to render a mobile-ready, font-resizeable HTML5 version of a paper uploaded to arXiv, the LaTeX source must be uploaded. ArXiv hosts certain categories of ScholarlyArticles.
JOSS (Journal of Open Source Software) has managed to get articles indexed by Google Scholar [rescience_gscholar]. They publish their costs [joss_costs]: $275 Crossref membership, DOIs: $1/paper:
> Assuming a publication rate of 200 papers per year this works out at ~$4.75 per paper
[citationstyles]: https://citationstyles.org
[citationstyles_stylerepo]: https://github.com/citation-style-language/styles
[gscholar_inclusion]: https://scholar.google.com/intl/en/scholar/inclusion.html#in...
[figshare_howtocite]: https://knowledge.figshare.com/articles/item/how-to-share-ci...
[zenodo_principles]: https://about.zenodo.org/principles/
[zenodo_newmetrics]: https://www.frontiersin.org/articles/10.3389/frma.2017.00013...
[rescience_gscholar]: https://github.com/ReScience/ReScience/issues/38
[arxiv_vanity]: https://www.arxiv-vanity.com/
[joss_costs]: https://joss.theoj.org/about#costs
[orcid]: https://en.wikipedia.org/wiki/ORCID
Do you like the browser bookmark manager?
How do you think it compares to services like webcull.com, raindrop.io, or getpocket.com? Have they advanced the field to the point that it's worth switching?
Things I'd add to browser bookmark managers someday:
- Support for (persisting) bookmarks tags. From the post re: the re-launch of del.icio.us: https://news.ycombinator.com/item?id=23985623
> "Allow reading and writing bookmark tags" https://bugzilla.mozilla.org/show_bug.cgi?id=1225916
> Notes re: how this could be standardized with JSON-LD: https://bugzilla.mozilla.org/show_bug.cgi?id=1225916#c116
> The existing Web Experiment for persisting bookmark tags: https://github.com/azappella/webextension-experiment-tags/bl...
- Standard search features like operators: ((term) AND (term2)) OR term3
- Regex search
- (Chrome) show the createdDate and allow (non-destructive) sort by date
- Native sync API for syncing to zero or more bookmarks / personal data storage providers
- Support for integration with extensions that support actual resource metadata like Zotero
- Linked Data support: extract and store bibliographic metadata like Zotero and OpenLink Structured Data Sniffer
What are the current limitations of the WebExtensions Bookmarks API (now supported by Firefox, Chrome, Edge, and hopefully eventually Safari)?: https://developer.mozilla.org/en-US/docs/Mozilla/Add-ons/Web...
NIST Samate – Source Code Security Analyzers
Additional lists of static analysis, dynamic analysis, SAST, DAST, and other source code analysis tools:
OWASP > Source Code Analysis Tools: https://owasp.org/www-community/Source_Code_Analysis_Tools
https://analysis-tools.dev/ (supports upvotes and downvotes)
analysis-tools-dev/static-analysis: https://github.com/analysis-tools-dev/static-analysis
analysis-tools-dev/dynamic-analysis: https://github.com/analysis-tools-dev/dynamic-analysis
devsecops/awesome-devsecops: https://github.com/devsecops/awesome-devsecops , https://github.com/TaptuIT/awesome-devsecops
kai5263499/awesome-container-security: https://github.com/kai5263499/awesome-container-security
https://en.wikipedia.org/wiki/DevOps#DevSecOps,_Shifting_Sec... :
> DevSecOps is an augmentation of DevOps to allow for security practices to be integrated into the DevOps approach. The traditional centralised security team model must adopt a federated model allowing each delivery team the ability to factor in the correct security controls into their DevOps practices.
awesome-safety-critical: https://awesome-safety-critical.readthedocs.io/en/latest/
A Handwritten Math Parser in 100 lines of Python
Slightly more amusingly, but non-equivalently:
    import sys
    from functools import reduce
    from operator import add, sub, mul, truediv as div

    def calc(s, ops=[(add, '+'), (sub, '-'), (mul, '*'), (div, '/')]):
        if not ops:
            return float(s)
        (func, token), *remaining_ops = ops
        return reduce(func, (calc(part, remaining_ops) for part in s.split(token)))

    if __name__ == '__main__':
        print(calc(''.join(sys.argv[1:])))
Does support order of operations, in spite of appearance. Extending to support parentheses is left as an exercise for the reader :)

This is pretty clever. There's a video by Jonathan Blow, creator of Braid and currently making a programming language called Jai, that talks about parsing expressions with order of operations. He gives a simple approach that apparently was discovered recently (though he didn't remember the paper name) that does this in a single pass and is hardly any more complicated than what you wrote. You can see it here [1]. He also trashes on using yacc/bison before that and on most academic parsing theory.
Reverse Polish notation (RPN) > Converting from infix notation https://en.wikipedia.org/wiki/Reverse_Polish_notation#Conver... > Shunting-yard algorithm https://en.wikipedia.org/wiki/Shunting-yard_algorithm
Infix notation supports parentheses.
Infix notation: 3 + 4 × (2 − 1)
RPN: 3 4 2 1 − × +
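A minimal shunting-yard sketch (handling only left-associative binary operators and parentheses) that converts that infix token list to the RPN above:

```python
def to_rpn(tokens):
    """Convert infix tokens to RPN via the shunting-yard algorithm."""
    prec = {'+': 1, '-': 1, '*': 2, '/': 2}
    output, ops = [], []
    for tok in tokens:
        if tok in prec:
            # Pop operators of higher-or-equal precedence (left-associative)
            while ops and ops[-1] != '(' and prec[ops[-1]] >= prec[tok]:
                output.append(ops.pop())
            ops.append(tok)
        elif tok == '(':
            ops.append(tok)
        elif tok == ')':
            while ops[-1] != '(':
                output.append(ops.pop())
            ops.pop()  # discard the '('
        else:
            output.append(tok)  # operand
    output.extend(reversed(ops))
    return output

print(to_rpn(['3', '+', '4', '*', '(', '2', '-', '1', ')']))
# → ['3', '4', '2', '1', '-', '*', '+']
```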
PEP – An open source PDF editor for Mac
I have a dream... that one day people will name their projects with names that don't exist on google yet.
If you search for PEP now you'll find python enhancement proposals, and the "Philippine Entertainment Portal" and the stock code for PepsiCo.
I wish that once people do name their project, they would assign it a 128-bit random number in lower case hex, and include that number on any web page that they would like people searching for their project to find.
That way once I know that say PEP the PDF editor exists and find its 128-bit number (let's say that is 379dd864b16eaca3ce94c15a6bdfcc73), at least I can subsequently toss a +379dd864b16eaca3ce94c15a6bdfcc73 on my searches to effectively let the search engine know I want PEP the PDF editor results rather than PEP the python enhancement results or PEP the entertainment portal results or PEP that refreshing beverage company stock symbol.
"xxd -l 16 -p /dev/urandom" is a handy way to get a 128-bit random hex number. A UUID generator works, too, although they usually include some punctuation you will need to delete and you might have to lower case their output.
> I wish that once people do name their project, they would assign it a 128-bit random number in lower case hex
We already have something similar: URLs.
tzs is proposing URNs rather than textual program names. A URL is unnecessarily specific (though I suppose you could anycast URL resolution).
> RFC 4122 defines a Uniform Resource Name (URN) namespace for UUIDs. A UUID presented as a URN appears as follows:[1]
> > urn:uuid:123e4567-e89b-12d3-a456-426655440000
https://en.wikipedia.org/wiki/Universally_unique_identifier#...
Version 4 UUIDs have 122 random bits (out of 128 bits total).
In Python:
>>> import uuid
>>> _id = uuid.uuid4()
>>> _id.urn
'urn:uuid:4c466878-a81b-4f22-a112-c704655fa4ee'
>>> _id.hex
'4c466878a81b4f22a112c704655fa4ee'

Whether search engines will consider a URL or a URN or a random str without dashes to be one searchable-for token is pretty ironic in terms of extracting relations between resources in a Linked Data hypergraph.
The relation between a resource and a Thing with a URI/URN/URL can be expressed with https://schema.org/about . In JSON-LD ("JSONLD"):

    {"@context": "https://schema.org",
     "@type": "WebPage",
     "about": {
       "@type": "SoftwareApplication",
       "identifier": "urn:uuid:4c466878-a81b-4f22-a112-c704655fa4ee",
       "url": ["", ""],
       "name": [
         "a schema.org/SoftwareApplication < CreativeWork < Thing",
         {"@value": "a rose by any other name",
          "@language": "en"}]}}
Or with RDFa:

<body vocab="https://schema.org/" typeof="WebPage">
<div property="about" typeof="SoftwareApplication">
<meta property="identifier" content="urn:uuid:4c466878-a81b-4f22-a112-c704655fa4ee"/>
<span property="name">a schema.org/SoftwareApplication < CreativeWork < Thing</span>
<span property="name" lang="en">a rose by any other name</span>
</div>
</body>
Or with Microdata:
<div itemtype="https://schema.org/WebPage" itemscope>
<link itemprop="http://www.w3.org/ns/rdfa#usesVocabulary" href="https://schema.org/" />
<div itemprop="about" itemtype="https://schema.org/SoftwareApplication" itemscope>
<meta itemprop="identifier" content="urn:uuid:4c466878-a81b-4f22-a112-c704655fa4ee" />
<meta itemprop="name" content="a schema.org/SoftwareApplication < CreativeWork < Thing"/>
<meta itemprop="name" content="a rose by any other name" lang="en"/>
</div>
</div>
The Unix timestamp will begin with 16 this Sunday
Redox: Unix-Like Operating System in Rust
I'm Jeremy Soller, the creator of Redox OS. Let me know if you have questions!
Are there tools to support static analysis and formal methods in Rust yet?
From https://news.ycombinator.com/item?id=21839514 re: awesome-safety-critical https://awesome-safety-critical.readthedocs.io/en/latest/ :
> > Does Rust have a chance in mission-critical software? (currently Ada and proven C niches) https://www.reddit.com/r/rust/comments/5iv5j7/does_rust_have...
FWIU, Sealed Rust is in progress.
And there's also RustPython for the userspace.
Ask HN: How are online communities established?
HN, Reddit, Stack Overflow, etc. are all established communities with users. How do you start a community when you don't have any users?
You should read the book People Powered by Jono Bacon. Has some really good insights and is a 101 course on exactly this.
Seconded. "People Powered: How Communities Can Supercharge Your Business, Brand, and Teams" (2019) https://g.co/kgs/CF5TEk
"The Art of Community: Building the New Age of Participation" (2012) https://g.co/kgs/P2V1kn
"Tribes: We need you to lead us" (2011) https://g.co/kgs/T8jaFS
The 1% 'rule' https://en.wikipedia.org/wiki/1%25_rule_(Internet_culture) :
> In Internet culture, the 1% rule is a rule of thumb pertaining to participation in an internet community, stating that only 1% of the users of a website add content, while the other 99% of the participants only lurk. Variants include the 1–9–90 rule (sometimes 90–9–1 principle or the 89:10:1 ratio),[1] which states that in a collaborative website such as a wiki, 90% of the participants of a community only consume content, 9% of the participants change or update content, and 1% of the participants add content.
... Relevant metrics:
- Marginal cost of service https://en.wikipedia.org/wiki/Marginal_cost
- Customer acquisition cost: https://en.wikipedia.org/wiki/Customer_acquisition_cost
- [Quantifiable and non-quantifiable] Customer Lifetime Value: https://en.wikipedia.org/wiki/Customer_lifetime_value
Last words of the almost-cliche community organizer surrounded by dormant accounts: "Network effects will result in sufficient (grant) funding"
Business model examples that may be useful for building and supporting sustainable communities with clear Missions, Objectives, and Criteria for Success: https://gist.github.com/ndarville/4295324
Python Documentation Using Sphinx
I usually generate new Python projects with a cookiecutter, such as cookiecutter-pypackage. I like that cookiecutter-pypackage includes a Makefile with a `docs` task, so I can call `make docs` to build the Sphinx docs in the docs/ directory, which include:
- a /docs/readme.rst that includes the /README.rst as the first document in the toctree
- a sensible set of default documents: readme (.. include:: /README.rst), installation, usage, modules (sphinx-autodoc output), contributing, authors, history (.. include:: /HISTORY.rst)
- a sphinx conf.py that sets the docs' version and release attributes to pkgname.__version__; so that the version number only needs to be changed in one place (as long as setup.py or setup.cfg also read the version string from pkgname.__version__)
- a default set of extensions: ['sphinx.ext.autodoc', 'sphinx.ext.viewcode'] that generates API docs and includes '[source]' hyperlinks from the generated API docs to the transcluded syntax-highlighted source code and links back to the API docs from the source code
https://github.com/audreyfeldroy/cookiecutter-pypackage/tree...
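The single-sourced version number can be sketched in conf.py along these lines (a sketch, not cookiecutter-pypackage's exact template; `pkgname` is a placeholder for your actual package, which must be importable when Sphinx builds the docs):

```python
# docs/conf.py (sketch; "pkgname" is a placeholder for your package)
import pkgname

release = pkgname.__version__                 # full version, e.g. "0.1.2"
version = ".".join(release.split(".")[:2])    # short X.Y form used by Sphinx

extensions = ["sphinx.ext.autodoc", "sphinx.ext.viewcode"]
```

As long as setup.py or setup.cfg also reads `pkgname.__version__`, the version string then lives in exactly one place.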
There are a few styles of docstrings that Sphinx can parse and include in docs with e.g. sphinx-autodoc:
`:param, :type, :returns, :rtype` docstrings (which OP uses; and which pycontracts can read runtime parameter and return type contracts from https://andreacensi.github.io/contracts/ (though Python 3 annotations are now the preferred style for compile or editing-time typechecks))
Numpydoc docstrings: https://numpydoc.readthedocs.io/en/latest/format.html
Googledoc docstrings: https://sphinxcontrib-napoleon.readthedocs.io/en/latest/
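For illustration, here is a hypothetical function documented in the `:param:`/`:rtype:` style that sphinx-autodoc parses natively (the function itself is made up for the example):

```python
def scale(values, factor):
    """Multiply each value by a constant factor.

    :param values: input numbers
    :type values: list of float
    :param factor: multiplier applied to each element
    :type factor: float
    :returns: the scaled values
    :rtype: list of float
    """
    return [v * factor for v in values]
```

Numpydoc and Googledoc docstrings carry the same information in different layouts; sphinxcontrib-napoleon translates the latter two into this form.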
You can use Markdown with Sphinx in at least three ways:
MyST Markdown supports Sphinx and Docutils roles and directives. Jupyter Book builds upon MyST Markdown. With Jupyter Book, you can include Jupyter notebooks (which can include MyST Markdown) in your Sphinx docs. Executable notebooks are a much easier way to include up-to-date code outputs in docs. https://myst-parser.readthedocs.io/en/latest/
Sphinx (& ReadTheDocs) w/ recommonmark: https://docs.readthedocs.io/en/stable/intro/getting-started-...
Nbsphinx predates Jupyter Book and doesn't yet support MyST Markdown, but does support Markdown cells in Jupyter notebooks. Nbsphinx includes a parser for including .ipynb Jupyter notebooks in Sphinx docs. nbsphinx supports raw RST (ReST) cells in Jupyter notebooks and has great docs: https://nbsphinx.readthedocs.io/en/latest/
Nbdev is another approach; though it's not Sphinx:
> nbdev is a library that allows you to fully develop a library in Jupyter Notebooks, putting all your code, tests and documentation in one place.
> [...] Add %nbdev_export flags to the cells that define the functions you want to include in your python modules
https://github.com/fastai/nbdev
A few additional sources of docs for Sphinx and ReStructuredText:
Read The Docs docs > Getting Started with Sphinx > External Resources https://docs.readthedocs.io/en/stable/intro/getting-started-...
CPython Devguide > "Documenting Python" https://devguide.python.org/documenting/
"How to write [Linux] kernel documentation" https://www.kernel.org/doc/html/latest/doc-guide/index.html
awesome-sphinxdoc: https://github.com/yoloseem/awesome-sphinxdoc
... "Ask HN: Recommendations for Books on Writing [for engineers]?" https://news.ycombinator.com/item?id=23945580
Traits of good remote leaders
And the evidence for these considerations is what?
They reference a study: https://link.springer.com/article/10.1007%2Fs10869-020-09698..., titled: "Who Emerges into Virtual Team Leadership Roles? The Role of Achievement and Ascription Antecedents for Leadership Emergence Across the Virtuality Spectrum".
Fortunately the references are free to view.
"Table 4 – Correlation of Development Phases, Coping Stages and Comfort Zone transitions and the Performance Model" in "From Comfort Zone to Performance Management" White (2008) tabularly correlates the Tuckman group development phases (Forming, Storming, Norming, Performing, Adjourning) with the Carnall coping cycle (Denial, Defense, Discarding, Adaptation, Internalization) and Comfort Zone Theory (First Performance Level, Transition Zone, Second Performance Level), and the White-Fairhurst TPR model (Transforming, Performing, Reforming). The ScholarlyArticle also suggests management styles for each stage (Commanding, Cooperative, Motivational, Directive, Collaborative); and suggests that team performance is described by chained power curves of re-progression through these stages.
https://scholar.google.com/scholar?hl=en&as_sdt=0%2C43&q=%E2...
IDK what's different about online teams in regards to performance management?
Show HN: Eiten – open-source tool for portfolio optimization
Is it possible to factor (e.g. GRI) sustainability criteria into the portfolio fitness function? https://news.ycombinator.com/item?id=21922558
My concern is that - like any other portfolio optimization algorithm - blindly optimizing on fundamentals and short term returns will lead to investing in firms who just dump external costs onto people in the present and future; so, screening with sustainability criteria is important to me.
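A screening step could be as simple as filtering the stock universe by a sustainability score before the optimizer ever sees it. A minimal sketch, assuming hypothetical tickers and made-up ESG scores (a real screen would pull GRI/ESG data from a ratings provider):

```python
# Hypothetical tickers and ESG scores; illustrative only.
esg_scores = {"AAA": 71.2, "BBB": 44.0, "CCC": 83.5, "DDD": 29.1}
ESG_FLOOR = 50.0

def screen_universe(tickers, scores, floor=ESG_FLOOR):
    """Drop tickers below the sustainability floor before optimization."""
    return [t for t in tickers if scores.get(t, 0.0) >= floor]

universe = screen_universe(list(esg_scores), esg_scores)
```

The optimizer then only allocates among names that pass the screen, rather than trying to express sustainability as a penalty term in the fitness function.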
From https://news.ycombinator.com/item?id=19111911 :
> awesome-quant lists a bunch of other tools for algos and superalgos: https://github.com/wilsonfreitas/awesome-quant
This is perfect. Thank you for sharing this.
I might start implementing some of these but would love for someone else to add a few PRs as well. The code is pretty modular especially if we want to add new strategies.
(Sustainable) Index ETFs in the stocks.txt universe would likely be less sensitive to single performers' effects in unbalanced portfolios.
> pyfolio.tears.create_interesting_times_tear_sheet measures algorithmic trading algorithm performance during "stress events" https://github.com/quantopian/pyfolio/blob/03568e0f328783a6a...
Ask HN: Any well funded tech companies tackling big, meaningful problems?
Are there any well funded tech startups / companies tackling major societal problems? Any of these fair game: https://en.wikipedia.org/wiki/List_of_global_issues
----
I don't see or hear of any and want to know if this is just my bias or if there really is a shortage of resources in tech being allocated to solving the worlds most important problems. I'm sure I'm not the only engineer that's looking out for companies like this.
Ran into this previous Ask HN (https://news.ycombinator.com/item?id=24168902) that asked a similar question. However, here I wanna focus on the better funded efforts (not side projects, philanthropy etc).
One example I've heard so far is Tesla. Any others?
You can make an impact by solving important local and global problems by investing your time, career, and savings; by listing and comparing solutions.
As a labor market participant, you can choose to work for places that have an organizational mission that strategically aligns with local, domestic, and international objectives.
https://en.wikipedia.org/wiki/Strategic_alignment ... "Schema.org: Mission, Project, Goal, Objective, Task" https://news.ycombinator.com/item?id=12525141
As an investor, you can choose to invest in organizations that are making the sort of impact you're looking for: you can impact invest.
https://en.wikipedia.org/wiki/Impact_investing
You mentioned "List of global issues"; which didn't yet have a link to the UN Sustainable Development Goals (the #GlobalGoals). I just added this to the linked article:
> As part of the 2030 Agenda for Sustainable Development, the UN Millennium Development Goals (2000-2015) were superseded by the UN Sustainable Development Goals (2016-2030), which are also known as The Global Goals. There are associated Targets and Indicators for each Global Goal.
There are 17 Global Goals.
Sustainability reporting standards can align with the Sustainable Development Goals. For example, the GRI standards are now aligned with the UN Sustainable Development Goals.
https://en.wikipedia.org/wiki/Sustainable_Development_Goals
Investors, fund managers, and potential employees can identify companies which are making an impact by reviewing corporate sustainability and ESG reports.
From https://www.undp.org/content/undp/en/home/sustainable-develo... :
> SDG Target 12.6: "Encourage companies, especially large and transnational companies, to adopt sustainable practices and to integrate sustainability information into their reporting cycle"
From https://news.ycombinator.com/item?id=21302926 :
> > What are some of the corporate sustainability reporting standards?
> > From https://en.wikipedia.org/wiki/Sustainability_reporting#Initi... :
> >> Organizations can improve their sustainability performance by measuring (EthicalQuote (CEQ)), monitoring and reporting on it, helping them have a positive impact on society, the economy, and a sustainable future. The key drivers for the quality of sustainability reports are the guidelines of the Global Reporting Initiative (GRI),[3] (ACCA) award schemes or rankings. The GRI Sustainability Reporting Guidelines enable all organizations worldwide to assess their sustainability performance and disclose the results in a similar way to financial reporting.[4] The largest database of corporate sustainability reports can be found on the website of the United Nations Global Compact initiative.
> >The GRI (Global Reporting Initiative) Standards are now aligned with the UN Sustainable Development Goals (#GlobalGoals). https://en.wikipedia.org/wiki/Global_Reporting_Initiative
> >> In 2017, 63 percent of the largest 100 companies (N100), and 75 percent of the Global Fortune 250 (G250) reported applying the GRI reporting framework.[3]
What are some good ways to search for companies who (1) do sustainability reports, (2) engage in strategic alignment in corporate planning sessions, (3) make sustainability a front-and-center issue in their company's internal and external communications?
What are some examples of companies who have a focus on sustainability and/or who have developed a nonprofit organization for philanthropic missions which are sometimes best accounted for as a distinct organization or a business unit (which can accept and offer receipts for donations as a non-profit)?
How can an employee drive change in a small or a large company? Identify opportunities to deliver value and goodwill. Read through the Global Goals, Targets, and Indicators; and get into the habit of writing down problems and solutions.
3 pillars of [Corporate] Sustainability: (Environment (Society (Economy))). https://en.wikipedia.org/wiki/Sustainability#Three_dimension...
"Launch HN: Charityvest (YC S20) – Employee charitable funds and gift matching" https://news.ycombinator.com/item?id=23907902 :
> We created a modern, simple, and affordable way for companies to include charitable giving in their suite of employee benefits.
> We give employees their own tax-deductible charitable giving fund, like an “HSA for Charity.” They can make contributions into their fund and, from their fund, support any of the 1.4M charities in the US, all on one tax receipt.
> Using the funds, we enable companies to operate gift matching programs that run on autopilot. Each donation to a charity from an employee is matched automatically by the company in our system.
> A company can set up a matching gift program and launch giving funds to employees in about 10 minutes of work.
"Salesforce Sustainability Cloud Becomes Generally Available" https://news.ycombinator.com/item?id=22068522 :
> Are there similar services for Sustainability Reporting and accountability?
Column Names as Contracts
I really like the idea of being more thoughtful about naming columns and being more explicit about the “type” of data contained in them.
Is this idea already known among data modelers or data engineers?
I’d love to read any other references, if available.
I am a data architect in my day job. Within the realm of data management, I'd say "metadata management" [1] is the general category this fits within.
I would say, yes this idea is known/very common, as data architecture is as much about the descriptive language we use as anything. I mean, "business glossaries", taxonomy, even just naming conventions [2] in coding, these are all related.
If you build enough databases/tables or even code yourself, you inevitably come across the "how to name things" problem [3]. If all you have to sort on for the known meaning of a thing (column, table, file, etc.) is a single string value, then encoding meaning into it is quite common. This way, a sort creates a kind of "grouping". Many database vendors follow standard naming conventions - such as Oracle, for example [4]. It is considered a best practice when designing/building the metadata for a large system, to establish a naming convention. Among other things, it makes finding things easier, as well as all the potential for automation.
You get all kinds of variations on this, such as, should the "ID_" come as a prefix or a suffix (i.e. "_ID"). One's initial thought is to use it as a prefix so all the related types group together, but then that becomes much more difficult if you want to sort items by their functional area (e.g. DRIVER_ID, DRIVER_IND, etc.).
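The sorting trade-off is easy to see directly (column names here are illustrative):

```python
# Hypothetical column names under the two conventions
suffix_style = ["DRIVER_ID", "DRIVER_IND", "VEHICLE_ID", "VEHICLE_IND"]
prefix_style = ["ID_DRIVER", "IND_DRIVER", "ID_VEHICLE", "IND_VEHICLE"]

# Suffix convention: a sort groups by functional area (DRIVER_*, VEHICLE_*)
print(sorted(suffix_style))

# Prefix convention: a sort groups by type instead (ID_*, then IND_*)
print(sorted(prefix_style))
```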
One other place you see something similar is in "smart numbers" which is an eternal argument - should I use a "dumb identifier" (GUID, integer) or a "smart one" (one encoding additional meaning) [5].
I mean, basically, any time you can encode information in the meta-data of data, I think you can then operate on it by following "convention over configuration" (as mentioned elsewhere in the discussion comments).
The only problem I see is that such conventions can at times be limiting - depending on the length of your metadata columns and the variability you are trying to capture - which is why I believe metadata is generally better separated from, and linked to, the data it describes: this decoupling allows for much more descriptive metadata than one could encode in a single string value. Certainly, you can get a long way with an approach like this, but I suspect you would run into 80/20-rule limitations.
Using naming in this way is a form of tight coupling, which could be seen as an anti-pattern in terms of meta-data flexibility, in some cases.
[1] https://en.wikipedia.org/wiki/Metadata_management
[2] https://en.wikipedia.org/wiki/Naming_convention_(programming...
[3] https://martinfowler.com/bliki/TwoHardThings.html
[4] https://oracle-base.com/articles/misc/naming-conventions
In terms of database normalization, delimiting multiple fields within a column name field violates the "atomic columns" requirement of the first through sixth normal forms (1NF - 6NF).
https://en.wikipedia.org/wiki/Database_normalization
Are there standards for storing columnar metadata (that is, metadata about the columns; or column-level metadata)?
In terms of columns, SQL has (implicit ordinal, name, type) and then primary key, index, and [foreign key] constraints.
RDFS (RDF Schema) is an open W3C linked data standard. An rdf:Property may have a rdfs:domain and a rdfs:range; where the possible datatypes are listed as instances of rdfs:range. Primitive datatypes are often drawn from XSD (XML Schema Definition), or https://schema.org/ . An rdfs:Class instance may be within the rdfs:domain and/or the rdfs:range of an rdf:Property.
RDFS is generally not sufficient for data validation; there are a number of standards which build upon RDFS: W3C SHACL (Shapes and Constraint Language), W3C CSVW (CSV on the Web).
There is some existing work on merging JSON Schema and SHACL.
CSVW builds upon the W3C "Model for Tabular Data and Metadata on the Web", which supports arbitrary "annotations" on columns. CSVW can be represented in any RDF serialization: Turtle/TriG/N3, RDF/XML, JSON-LD.
https://www.w3.org/TR/tabular-data-primer/
https://www.w3.org/TR/tabular-data-model/ :
> an annotated tabular data model: a model for tables that are annotated with metadata. Annotations provide information about the cells, rows, columns, tables, and groups of tables […]
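A minimal sketch of that kind of column-level annotation as CSVW-style JSON metadata - the column names and the `dc:description`/`unit` annotations here are illustrative; CSVW permits arbitrary annotation keys alongside the core ones like `name` and `datatype`:

```python
import json

metadata = {
    "@context": "http://www.w3.org/ns/csvw",
    "url": "measurements.csv",
    "tableSchema": {
        "columns": [
            {"name": "sensor_id", "datatype": "string",
             "dc:description": "Unique sensor identifier"},
            {"name": "temperature", "datatype": "decimal",
             "dc:description": "Air temperature",
             "unit": "degC"},  # arbitrary annotation, not a core CSVW key
        ]
    },
}
print(json.dumps(metadata, indent=2))
```

The unit, description, and datatype live next to the column definition rather than being delimited into the column name itself.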
...
From https://twitter.com/westurner/status/901992073846456321 :
> "7 metadata header rows (column label, property URI path, DataType, unit, accuracy, precision, significant figures)" https://wrdrd.github.io/docs/consulting/linkedreproducibilit...
...
From https://twitter.com/westurner/status/1295774405923147778 :
> Relevant: https://discuss.ossdata.org/ topics: "Linked Data formats, tools, challenges, opportunities; CSVW, https://schema.org/Dataset , https://schema.org/ScholarlyArticle " https://discuss.ossdata.org/t/linked-data-formats-tools-chal...
> "A dataframe protocol for the PyData ecosystem" https://discuss.ossdata.org/t/a-dataframe-protocol-for-the-p...
> A .meta protocol should implement the W3C Tabular Data Model: [...]
...
The various methods of doing CSV2RDF and R2RML (SQL / RDB to RDF Mapping) each have a way to specify additional metadata annotations. None stuff data into a column name (which I'm also guilty of doing with e.g. "columnspecs" in a small line-parsing utility called pyline that can cast columns to Python types and output JSON lines).
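A toy version of that columnspec idea - keeping (name, type) metadata in a separate spec rather than in the column names - might look like this (illustrative names, not pyline's actual API):

```python
import json

# Column metadata kept separate from the data: (name, cast) pairs
colspec = [("name", str), ("year", int), ("score", float)]

def parse_line(line, spec=colspec, sep="\t"):
    """Split a delimited line, cast each column per the spec, return a dict."""
    values = line.rstrip("\n").split(sep)
    return {name: cast(v) for (name, cast), v in zip(spec, values)}

print(json.dumps(parse_line("ada\t1842\t9.5")))
```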
...
Even JSON5 is insufficient when it comes to representing e.g. complex fractions: there must be a TBox (schema) in order to read the data out of the ABox (assertions; e.g. JSON). JSON-LD is sufficient for representation; and there are also specs like RDFS, SHACL, and CSVW.
ABox: https://en.wikipedia.org/wiki/Abox
I see the line of thinking you're going down. There are ISO standards for data types, so in a sense I can see why one would seek a standard language for defining the metadata/specification of a type as data. I'll have to think about that some more. In a way, a regex could be seen as a compact form of expressing the capability of a column in terms of value ranges or domains; but for defining the meaning of the data, not so much.
Your interpretation of the atomic columns requirement is a little different than my understanding. That requirement of normalization only applies to the "cells" of columnar data, it says nothing about encoding meaning into column names, which are themselves simply descriptive metadata.
I mean, for sure you wouldn't want to encode many values/meanings into a column name (some systems have length restrictions that would make that impossible, I'm not sure it makes sense anyway), but just pointing out that technically the spec does not make that illegal. Certainly, adding minor annotations within the name of a column separated by a supported delimiter does not, in my opinion, violate normalization rules at all. I mean things like "ID_" or similar.
Have you looked at INFORMATION_SCHEMA in SQL databases? [1] You mentioned SQL metadata and constraints, that is as close to a standard feature for querying that information there is, some databases do it using similar but non-standard ways (Oracle for example).
Also, not standard but, many relational databases support extended properties or Metadata for objects (tables, views, columns, etc.) - you can often come up with your own scheme although rarely do I see people utilize these features. [2] [3]
At some point it feels like we are more talking about type definitions and annotations, applied to data columns.
Maybe like, BNF [4] for purely data table columns (which are essentially types)?
[1] https://en.wikipedia.org/wiki/Information_schema
[2] http://www.postgresql.org/docs/current/static/sql-comment.ht...
[3] https://docs.microsoft.com/en-us/sql/relational-databases/sy...
Graph Representations for Higher-Order Logic and Theorem Proving (2019)
ONNX (and maybe RIF) are worth mentioning.
ONNX: https://onnx.ai/ :
> ONNX is an open format built to represent machine learning models. ONNX defines a common set of operators - the building blocks of machine learning and deep learning models - and a common file format to enable AI developers to use models with a variety of frameworks, tools, runtimes, and compilers
RIF (~FOL): https://en.wikipedia.org/wiki/Rule_Interchange_Format
Datalog (not Turing-complete): https://en.wikipedia.org/wiki/Datalog
HOList Benchmark: https://sites.google.com/view/holist/home
"HOList: An Environment for Machine Learning of Higher-Order Theorem Proving" (2019) https://arxiv.org/abs/1904.03241
> Abstract: We present an environment, benchmark, and deep learning driven automated theorem prover for higher-order logic. Higher-order interactive theorem provers enable the formalization of arbitrary mathematical theories and thereby present an interesting, open-ended challenge for deep learning. We provide an open-source framework based on the HOL Light theorem prover that can be used as a reinforcement learning environment. HOL Light comes with a broad coverage of basic mathematical theorems on calculus and the formal proof of the Kepler conjecture, from which we derive a challenging benchmark for automated reasoning. We also present a deep reinforcement learning driven automated theorem prover, DeepHOL, with strong initial results on this benchmark.
I wonder why they don't mention any work based on transformer architectures? The recent work on solving differential equations based on expressions in reverse polish notation seemed like a reasonable idea to apply to theorem proving as well.
Really cool work though!
A transformer is unable to really represent logic, let alone higher order logic and theorem proving.
A transformer is a universal function approximator. The question is whether it can do so reasonably efficiently. Trained on natural language linear sequences, I’m with you. Trained on abstract logical graph representations? I don’t think that question’s answered yet, unless I’m missing something.
How do transformers handle truth tables, logical connectives, propositional logic / rules of inference, and first-order logic?
Truth table: https://en.wikipedia.org/wiki/Truth_table
Logical connective: https://en.wikipedia.org/wiki/Logical_connective
Propositional logic: https://en.wikipedia.org/wiki/Propositional_calculus
Rules of inference: https://en.wikipedia.org/wiki/Rule_of_inference
DL: Description logic: https://en.wikipedia.org/wiki/Description_logic (... The OWL 2 profiles (EL, QR, RL; DL, Full) have established decideability and complexity: https://www.w3.org/TR/owl2-profiles/ )
FOL: First-order logic: https://en.wikipedia.org/wiki/First-order_logic
HOL: Higher-order logic: https://en.wikipedia.org/wiki/Higher-order_logic
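For concreteness, a truth table is just an exhaustive enumeration of a connective over all input combinations, e.g.:

```python
from itertools import product

def truth_table(connective, arity=2):
    """Enumerate a boolean connective over every combination of inputs."""
    return {args: connective(*args)
            for args in product([False, True], repeat=arity)}

# Material implication: p -> q is False only when p is True and q is False
implies = truth_table(lambda p, q: (not p) or q)
for args, value in sorted(implies.items()):
    print(args, value)
```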
In terms of regurgitating without critical reasoning?
Critical reasoning: https://en.wikipedia.org/wiki/Critical_thinking
Think about a program that can write code but not execute it. It’s not hard to get a transformer to learn to write python code like merge sort or simple arithmetic code, even though a transformer can’t reasonably learn to sort, nor can it learn simple arithmetic. It’s an important disambiguation. In one view it appears it can’t (learn to sort) and in another it demonstrably can (learn to code up a sort function). It can learn to usefully manipulate the language without needing the capacity to execute the language. They probably can’t learn to “execute” what you’re talking about (execute being a loose analogy), but I’d say the jury’s out on whether they can learn to usefully manipulate it.
A transformer is a function from a sequence of symbols to a sequence of symbols. A truth table, for example, is exactly this kind of function, from variables to the [True, False] alphabet.
A transformer can represent some very complex logical operations, and per this article is Turing-complete: https://arxiv.org/abs/1901.03429, meaning any computable function, including a theorem prover, can be represented as a transformer.
Another question is whether it is feasible/viable/rational to build a transformer for this. My intuition says: no.
Show HN: Linux sysadmin course, eight years on
Almost eight years ago I launched an online “Linux sysadmin course for newbies” here at HN.
It was a side-project that went well, but never generated enough money to allow me to fully commit to leaving the Day Job. After surviving the Big C, and getting made redundant I thought I might improve and relaunch it commercially – but my doctors are a pessimistic bunch, so it looked like I didn’t have the time.
Instead, I rejigged/relaunched it via a Reddit forum this February as free and open - and have now gathered a team of helpers to ensure that it keeps going each month even after I can’t be involved any longer.
It’s a month-long course which restarts each month, so “Day 1” of September is this coming Monday.
It would be great if you could pass the word on to anyone you know who may be the target market of those who: “...aspire to get Linux-related jobs in industry - junior Linux sysadmin, devops-related work and similar”.
[0] http://www.linuxupskillchallenge.org/
[1] https://www.reddit.com/r/linuxupskillchallenge/
[2] http://snori74.blogspot.com/2020/04/health-status.html
There are a number of resources that may be useful for your curriculum for this project listed in "Is there a program like codeacademy but for learning sysadmin?" https://news.ycombinator.com/item?id=19469266 :
> [ http://www.opsschool.org/ , https://github.com/kahun/awesome-sysadmin/blob/master/README... , https://github.com/stack72/ops-books , https://landing.google.com/sre/books/ , https://response.pagerduty.com/ (Incident Response training)]
To that I'd add that K3D (based on K3S, which is now a CNCF project) runs Kubernetes (k8s) in Docker containers. https://github.com/rancher/k3d
For zero-downtime (HA: High availability) deployments, "Zero-Downtime Deployments To a Docker Swarm Cluster" describes Rolling Updates and Blue-Green Deployments; with illustrations: https://github.com/vfarcic/vfarcic.github.io/blob/master/doc...
For git-push style deployment with more of a least privileges approach (which also has more moving parts) you could take a look at: https://github.com/dokku/dokku-scheduler-kubernetes#function...
And also reference ansible molecule and testinfra for writing sysadmin tests and the molecule vagrant driver for testing docker configurations. https://www.jeffgeerling.com/blog/2018/testing-your-ansible-...
https://molecule.readthedocs.io/en/latest/
https://testinfra.readthedocs.io/en/latest/ :
> With Testinfra you can write unit tests in Python to test actual state of your servers configured by management tools like Salt, Ansible, Puppet, Chef and so on.
> Testinfra aims to be a Serverspec equivalent in python and is written as a plugin to the powerful Pytest test engine.
I wasn't able to find a syllabus or a list of all of the daily posts? Are you focusing on DevOps and/or DevSecOps skills?
EDIT: The lessons are Markdown files in a Git repo: https://github.com/snori74/linuxupskillchallenge
Links to each lesson, the title and/or subjects of the lesson, and the associated reddit posts might be useful in a Table of Contents in the README.md.
Thanks, but most of that would be way over the top for my "newbies".
However, You must be the third or fourth person today to suggest that I add a TOC - so that is something I think I'll need to look at!
Maybe most useful as resources for further study.
Looks like Day 20 covers shell scripting. A few things worth mentioning:
You can write tests for shell scripts and write TAP (Test Anything Protocol) -formatted output: https://testanything.org/producers.html#shell
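TAP output is simple enough to emit from almost any language; a minimal producer sketch (in Python for brevity; a shell harness like bats produces the same format):

```python
def tap_report(results):
    """Render (description, passed) pairs as TAP-formatted output."""
    lines = [f"1..{len(results)}"]  # the TAP "plan" line
    for i, (desc, passed) in enumerate(results, 1):
        status = "ok" if passed else "not ok"
        lines.append(f"{status} {i} - {desc}")
    return "\n".join(lines)

print(tap_report([("backup script exits zero", True),
                  ("variables are quoted", False)]))
```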
Quoting in shell scripts is something to be really careful about:
These two commands do different things:
# unquoted: word-splitting turns "-e" into an echo flag, so this
# interprets the \n and prints "a", a newline, then "b"
echo $(echo "-e a\nb")
# quoted: the substitution is preserved verbatim; prints "-e a\nb"
echo "$(echo "-e a\nb")"
Shellcheck can identify some of those types of (security) bugs/errors/vulns in shell scripts: https://www.shellcheck.net/
LearnXinYminutes has a good bash reference: https://learnxinyminutes.com/docs/bash/
And an okay Ansible reference, which (like Ops School) we should contribute to: https://learnxinyminutes.com/docs/ansible/
Why do so many pros avoid maintaining shell scripts, instead writing one-off commands that they'll never remember to run again later?
...
It may be helpful to format these as Jupyter notebooks with input and output cells.
- Ctrl-Shift-Minus splits a cell at the cursor
- M and Y toggle a cell between Markdown and code
If you don't want to prefix every code cell line with a '!' so that the ipykernel Jupyter python kernel (the default kernel) executes the line with $SHELL, you can instead install and select bash_kernel; though users attempting to run the notebooks interactively would then need to also have bash_kernel installed: https://github.com/takluyver/bash_kernel
You can save a notebook .ipynb to any of a number of Markdown and non-Markdown formats https://jupytext.readthedocs.io/en/latest/formats.html#markd... ; unfortunately jupytext only auto-saves to md without output cell content for now: https://github.com/mwouts/jupytext/issues/220
You can make reveal.js slides (that do include outputs) from a notebook: https://gist.github.com/mwouts/04a6dfa571bda5cc59fa1429d1309...
With nbconvert, you can manually save an .ipynb Jupyter notebook as Markdown which includes the cell outputs w/ File > "Download as / Export Notebook as" > "Export notebook to Markdown" or with the CLI: https://nbconvert.readthedocs.io/en/latest/usage.html#conver...
jupyter nbconvert --to markdown
jupyter nbconvert --help
With Jupyter Book, you can build an [interactive] book as HTML and/or PDF from multiple Jupyter notebooks and/or Markdown documents: https://jupyterbook.org/intro.html
jupyter-book build mybook/
...From https://westurner.github.io/tools/#bash :
type bash
bash --help
help help
help type
apropos bash
info bash
man bash
man man
info info
From https://news.ycombinator.com/item?id=22980353 ; this is how dotfiles work: info bash -n "Bash Startup Files"
> https://www.gnu.org/software/bash/manual/html_node/Bash-Star... ...
Re: dotfiles, losing commands that should've been logged to HISTFILE when running multiple bash sessions and why I wrote usrlog.sh: https://westurner.github.io/hnlog/#comment-20671184 (Ctrl-F for: "dotfiles", "usrlog.sh", "inputrc")
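A minimal sketch of the usual ~/.bashrc fix for the HISTFILE problem mentioned above: flush each command to HISTFILE as soon as it runs, so concurrent sessions don't clobber each other's history on exit. (The size limits are illustrative values.)

```shell
# Append to HISTFILE instead of overwriting it on shell exit
shopt -s histappend 2>/dev/null || true
export HISTSIZE=100000 HISTFILESIZE=100000
# Write each command to HISTFILE immediately, not just at exit
export PROMPT_COMMAND='history -a'
echo "$PROMPT_COMMAND"
```

Tools like usrlog.sh go further by timestamping and segregating per-session logs, but this covers the common "lost commands from a second terminal" case.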
https://github.com/webpro/awesome-dotfiles
...
awesome-sysadmin > resources: https://github.com/kahun/awesome-sysadmin#resources
Software supply chain security
Estimates of prevalence do assume detection. How would we detect that a dependency that was installed a few deployments and reboots ago was compromised?
How does the classic infosec triad (Confidentiality, Integrity, Availability) apply to software supply chain security?
Confidentiality: Presumably we're talking about open source projects; which aren't confidential. Projects may request responsible disclosure in an e.g. security.txt; and vuln reports may be confidential for at least a little while.
Integrity: Secure transport protocols, checksums, and cryptographic code signing are ways to mitigate data integrity risks. GitHub supports SSH, 2FA, and GPG keys. Can all keys in the package signature keyring be used to sign any package? Can we verify a public key over a different channel? When we specify exact versions of software dependencies, can we also record package hashes which the package installer(s) will verify?
Availability: What are the internal and external data, network, and service dependencies for the development and deployment DevSecOps workflows? Can we deploy from local package mirrors? Who is responsible for securing and updating local package mirrors? Are these service dependencies all HA? Does everything in this system also depend upon the load balancer? Does our container registry support e.g. Docker Notary (TUF)? How should we mirror TUF package repos?
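The hash-pinning question under "Integrity" can be sketched with plain checksums. The tarball and its digest below are generated locally as stand-ins (assumptions, not a real package); with pip, the same idea is `pip install --require-hashes -r requirements.txt`, where each requirement line carries `--hash=sha256:<digest>`.

```shell
# Stand-in for a downloaded sdist/wheel
echo "example artifact" > pkg-1.0.tar.gz
# Record the pinned digest (this is the value you'd commit alongside the version pin)
sha256sum pkg-1.0.tar.gz > pkg-1.0.tar.gz.sha256
# Verification step at install time; prints "pkg-1.0.tar.gz: OK" on success
sha256sum -c pkg-1.0.tar.gz.sha256
```

The verification only helps if the pinned digest travels over a channel the attacker doesn't control (e.g. committed to the repo rather than fetched from the same mirror).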
See also: "Guidance for [[transparent] proxy cache] partial mirrors?" https://github.com/theupdateframework/specification/issues/1...
A toolset that answers some of your questions is grafeas- a metadata store at https://github.com/grafeas/grafeas- and kritis, a policy engine at https://github.com/grafeas/kritis.
Cheers.
Thanks for the links. Do you know how this toolset helps to mitigate/prevent what is called in the GitHub blogpost "Supply chain compromises". Quickly checked around and couldn't find anything that applies to the dependencies of applications/binaries before they land into the target runtime (i.e k8s).
Have you seen these preso slides
https://www.slideshare.net/mobile/aysylu/q-con-sp-software-s....
They walk through one of the workflows (end state is deploying to k8s).
Grafeas is a metadata store, Kritis is a policy engine that plugs into k8s as an admission controller- blessing the "admission" (running) of an image in a namespace.
There are existing tools for each language/runtime that produce known vuln lists for individual artifacts in the language ecosystem. These you feed into Grafeas. And you have your CI pipeline providing manifests for each of your built images that contain all upstream dependencies (these produced from each app's build tool). Then at deploy time, Kritis checks the manifest on the image, and for each artifact in the image, checks for vulns and determines whether the vuln should keep the image from being deployed.
Hope that helps. There are many other workflows but that one is the most direct.
Cheers.
OUTSTANDING comment; excellent questions. Bookmarked. Thanks for this concise high-level infosec punchlist.
This Sir is senior.
Mind Emulation Foundation
"While we are far from understanding how the mind works, most philosophers and scientists agree that your mind is an emergent property of your body. In particular, your body’s connectome. Your connectome is the comprehensive network of neural connections in your brain and nervous system. Today your connectome is biological. "
This is a pretty speculative thesis. It's not at all clear that everything relevant to the mind is found in the connections rather than in the particular biochemical processes of the brain. It's a very reductionist view that drastically underestimates the biological complexity of even individual cells. There's a good book, Wetware: A Computer in Every Living Cell by Dennis Bray, that goes into detail on how much functionality and how many physical processes are at work even in the simplest cells, all of which is routinely ignored by these analogies of the brain to a digital computer.
There is this extreme, and I would argue unscientific, bias towards treating the mind as something that's recreatable in a digital system, probably because it enables this science-fiction speculation and dreams of immortality, of people living in the cloud.
Indeed. We humans largely create devices that function either through calculation or through physical reaction, relying on the underlying rules of the universe to "do the math" of, say, launching a cannonball and having it follow a consistent arc. The brain combines both at almost every level. It may be fundamentally impossible to emulate a human personality equal to a real one without a physics simulation of a human brain and its chemistry.
A dragonfly brain takes the input from thirty thousand visual receptor cells and uses it to track prey movement using only sixteen neurons. Could we do the same using an equal volume of transistors?
No one is saying a neuron is a one to one equivalent with a transistor. That behavior does seem like it's possible to emulate with many transistors, however.
Was just talking about quantum cognition and memristors (in context to GIT) a few days ago: https://news.ycombinator.com/item?id=24317768
Quantum cognition: https://en.wikipedia.org/wiki/Quantum_cognition
Memristor: https://en.wikipedia.org/wiki/Memristor
It may yet be possible to sufficiently functionally emulate the mind with (orders of magnitude more) transistors. Though, is it necessary to emulate e.g. autonomic functions? Do we consider the immune system to be part of the mind (and gut)?
Perhaps there's something like an amplituhedron - or some happenstance correspondence - that will enable more efficient simulation of quantum systems on classical silicon, pending orders-of-magnitude increases in coherence and reductions in error rate in whichever computation medium.
For abstract formalisms (which do incorporate transistors as a computation medium sufficient for certain tasks), is there a more comprehensive set than Constructor Theory?
Constructor theory: https://en.wikipedia.org/wiki/Constructor_theory
Amplituhedron: https://en.wikipedia.org/wiki/Amplituhedron
What is the universe using our brains to compute? Is abstract reasoning even necessary for this job?
Something worth emulating: Critical reasoning. https://en.wikipedia.org/wiki/Critical_reasoning
13 Beautiful Tools to Enhance Online Teaching and Learning Skills
"Options for giving math talks and lectures online" https://news.ycombinator.com/item?id=22541754 also lists a number of resources for teaching math online.
How close are computers to automating mathematical reasoning?
It's irresponsible to not mention that a series of results from the 1930s through the 1950s proved that generalized automated proof search is impossible, in the sense that we cannot just ask a computer to find a proof or refutation of Goldbach's Conjecture or other "Goldbach-type" statement and have it do any better than a person. In that sense, no, computers will never fully automate mathematical reasoning, because mathematical reasoning cannot be fully automated.
What we can automate, and have successfully automated many times over, is proof checking. In that sense, yes, computers have already fully automated mathematical reasoning, because checking proofs is fully formalized and mechanized.
I think that what people really ought to be asking for is the ease with which we find new results and communicate them to others. In that sense, computer-aided proofs can be hard to read and so there is much work to be done in making them easier to communicate to humans. Similarly, there is interesting work being done on how to make computer-generated proofs which use extremely-high-level axiom schemata to generate human-like handwaving abstractions.
I'm still not sure how I feel about the ATP/ITP terminology used here. ATPs are either of the weak sort that crunch through SAT, graphs, and other complete-but-hard problems, or the strong sort which are impossible. Meanwhile, folks have drifted from "interactive" to "assistant", and talk of "proof assistants" as tools which, like a human, can write down and look at sections of a proof in isolation, but cannot summon complete arbitrary proofs from its own mind.
Edit: One edit will be quicker than two replies and I tend to be "posting too fast". The main point is captured well by [0], but they immediately link to [1], the central result, which links to [2], an important corollary. At this point, I'm just going to quote WP:
> The first [Gödel] incompleteness theorem states that no consistent system of axioms whose theorems can be listed by an effective procedure (i.e., an algorithm) is capable of proving all truths about the arithmetic of natural numbers. For any such consistent formal system, there will always be statements about natural numbers that are true, but that are unprovable within the system.
> The second [Gödel] incompleteness theorem, an extension of the first, shows that the system cannot demonstrate its own consistency.
> Informally, [Tarski's Undefinability] theorem states that arithmetical truth cannot be defined in arithmetic.
I recognize that these statements may seem surprising, but they are provable and you should convince yourself of them. I recently reviewed [3] and found it to be a very precise and complete introduction to all of the relevant ideas; there's also GEB if you want something more fun.
[0] https://en.wikipedia.org/wiki/Automated_theorem_proving#Deci...
[1] https://en.wikipedia.org/wiki/G%C3%B6del%27s_incompleteness_...
[2] https://en.wikipedia.org/wiki/Tarski%27s_undefinability_theo...
[3] https://www.logicmatters.net/resources/pdfs/godelbook/GodelB...
As someone unfamiliar with the results you describe, why can’t a machine do better than a human? A smart human is better at writing proofs than a dumb one. If we had a highly advanced general artificial intelligence, why couldn’t it generate better results than the smart humans?
Or is automated proof search impossible for humans as well?
Arguably, humans require more energy per operation. So, presumably such an argument hinges upon what types of operations are performed in conducting automated proof search?
What would "automated proof search" even mean in the context of being performed by a human being?
The context here is that certain problems cannot be solved by an algorithm, which doesn't necessarily translate into "cannot be performed by a machine".
It only means that there cannot be any Turing machine capable of solving them, no matter what "operations" it represents.
The task (in terms of constructor theory) is: Find the functions that sufficiently approximate the observations and record their reproducible derivations.
Either the (unreferenced) study was actually arguing that "automated proof search" can't be done at all, or that human neural computation is categorically non-algorithmic.
Grid search of all combinations of bits that correspond to [symbolic] classical or quantum models.
Or better: evolutionary algorithms and/or neural nets.
Roger Penrose argues that human neural computation is indeed non-algorithmic in nature (see his book "The Emperor's New Mind"; 1989) and speculates that quantum processes are involved.
I don't quite understand the role of neural nets in that context, though. Those are just classical computations in the end, and should be bound by the same limits as every other algorithm, shouldn't they?
That human cognition is quantum in nature - that e.g. entanglement is necessary - may be unfalsifiable.
Neuromorphic engineering has expanded since the 1980s. https://en.wikipedia.org/wiki/Neuromorphic_engineering
Quantum computing is the best known method for simulating chemical reactions and thereby possibly also neurochemical reactions. But is quantum computing necessary to functionally emulate human cognition?
It may be that a different computation medium can accomplish the same tasks without emulating all of the complexity of the brain.
If the brain is only classical and some people are using their brains to perform quantum computations, there may be something there.
Quantum cognition: https://en.wikipedia.org/wiki/Quantum_cognition
Quantum memristors are still elusive.
From "Quantum Memristors in Frequency-Entangled Optical Fields" (2020) https://www.ncbi.nlm.nih.gov/pmc/articles/PMC7079656/ :
> Apart from the advantages of using these devices for computation [12] (such as energy efficiency [13], compared to transistor-based computers), memristors can be also used in machine learning schemes [14,15]. The relevance of the memristor lies in its ubiquitous presence in models which describe natural processes, especially those involving biological systems. For example, memristors inherently describe voltage-dependent ion-channel conductances in the axon membrane in neurons, present in the Hodgkin–Huxley model [16,17].
> Due to the inherent linearity of quantum mechanics, it is not straightforward to describe a dissipative non-linear memory element, such as the memristor, in the quantum realm, since nonlinearities usually lead to the violation of fundamental quantum principles, such as no-cloning theorem. Nonetheless, the challenge was already constructively addressed in Ref. [18]. This consists of a harmonic oscillator coupled to a dissipative environment, where the coupling is changed based on the results of a weak measurement scheme with classical feedback. As a result of the development of quantum platforms in recent years, and their improvement in controllability and scalability, different constructions of a quantum memristor in such platforms have been presented. There is a proposal for implementing it in superconducting circuits [7], exploiting memory effects that naturally arise in Josephson junctions. The second proposal is based on integrated photonics [19]: a Mach–Zehnder interferometer can behave as a beam splitter with a tunable reflectivity by introducing a phase in one of the beams, which can be manipulated to study the system as a quantum memristor subject to different quantum state inputs.
Quantum harmonic oscillators have also found application in modeling financial markets. Quantum harmonic oscillator: https://en.wikipedia.org/wiki/Quantum_harmonic_oscillator
New framework for natural capital approach to transform policy decisions
Natural capital: https://en.wikipedia.org/wiki/Natural_capital
> Natural capital is the world's stock of natural resources, which includes geology, soils, air, water and all living organisms. Some natural capital assets provide people with free goods and services, often called ecosystem services. Two of these (clean water and fertile soil) underpin our economy and society, and thus make human life possible.
Natural capital accounting: https://en.wikipedia.org/wiki/Natural_capital_accounting
> Natural capital accounting is the process of calculating the total stocks and flows of natural resources and services in a given ecosystem or region.[1] Accounting for such goods may occur in physical or monetary terms. This process can subsequently inform government, corporate and consumer decision making as each relates to the use or consumption of natural resources and land, and sustainable behaviour.
Opportunity cost: https://en.wikipedia.org/wiki/Opportunity_cost
> When an option is chosen from alternatives, the opportunity cost is the "cost" incurred by not enjoying the benefit associated with the best alternative choice.[1] The New Oxford American Dictionary defines it as "the loss of potential gain from other alternatives when one alternative is chosen."[2] In simple terms, opportunity cost is the benefit not received as a result of not selecting the next best option. Opportunity cost is a key concept in economics, and has been described as expressing "the basic relationship between scarcity and choice". [3] The notion of opportunity cost plays a crucial part in attempts to ensure that scarce resources are used efficiently.[4] Opportunity costs are not restricted to monetary or financial costs: the real cost of output forgone, lost time, pleasure or any other benefit that provides utility should also be considered an opportunity cost. The opportunity cost of a product or service is the revenue that could be earned by its alternative use.
How do we value essential dependencies in terms of future opportunity costs?
In terms of just mental health?
"National parks a boost to mental health worth trillions: study" https://phys.org/news/2019-11-national-boost-mental-health-w...
> Visits to national parks around the world may result in improved mental health valued at about $US6 trillion (5.4 trillion euros), according to a team of ecologists, psychologists and economists
> Professor Bateman's decision-making framework focuses on the links between the environment and economy and has three components: efficiency, assessing which option generates the greatest benefit; sustainability, the effects of each option on natural capital stocks; and equity, regarding who receives the benefits of a decision and when.
Ian J. Bateman et al. "The natural capital framework for sustainably efficient and equitable decision making", Nature Sustainability (2020). DOI: 10.1038/s41893-020-0552-3 https://www.nature.com/articles/s41893-020-0552-3
Challenge to scientists: does your ten-year-old code still run?
"Ten Simple Rules for Reproducible Computational Research" http://www.ploscompbiol.org/article/info%3Adoi%2F10.1371%2Fj... :
> Rule 1: For Every Result, Keep Track of How It Was Produced
> Rule 2: Avoid Manual Data Manipulation Steps
> Rule 3: Archive the Exact Versions of All External Programs Used
> Rule 4: Version Control All Custom Scripts
> Rule 5: Record All Intermediate Results, When Possible in Standardized Formats
> Rule 6: For Analyses That Include Randomness, Note Underlying Random Seeds
> Rule 7: Always Store Raw Data behind Plots
> Rule 8: Generate Hierarchical Analysis Output, Allowing Layers of Increasing Detail to Be Inspected
> Rule 9: Connect Textual Statements to Underlying Results
> Rule 10: Provide Public Access to Scripts, Runs, and Results
... You can get a free DOI for and archive a tag of a Git repo with FigShare or Zenodo.
... re: [Conda and] Docker container images https://news.ycombinator.com/item?id=24226604 :
> - repo2docker (and thus BinderHub) can build an up-to-date container from requirements.txt, environment.yml, install.R, postBuild and any of the other dependency specification formats supported by REES: the Reproducible Execution Environment Specification; which may be helpful as Docker Hub images will soon be deleted if they're not retrieved at least once every 6 months (possibly with a GitHub Actions cron task)
BinderHub builds a container with the specified versions of software and installs a current version of Jupyter Notebook with repo2docker, and then launches an instance of that container in a cloud.
“Ten Simple Rules for Creating a Good Data Management Plan” http://journals.plos.org/ploscompbiol/article?id=10.1371/jou... :
> Rule 6: Present a Sound Data Storage and Preservation Strategy
> Rule 8: Describe How the Data Will Be Disseminated
... DVC: https://github.com/iterative/dvc
> Data Version Control or DVC is an open-source tool for data science and machine learning projects. Key features:
> - Simple command line Git-like experience. Does not require installing and maintaining any databases. Does not depend on any proprietary online services.
> - Management and versioning of datasets and machine learning models. Data is saved in S3, Google cloud, Azure, Alibaba cloud, SSH server, HDFS, or even local HDD RAID.
> - Makes projects reproducible and shareable; helping to answer questions about how a model was built.
There are a number of great solutions for storing and sharing datasets.
... "#LinkedReproducibility"
Open textual formats for data and open source application and system software (more precisely, FLOSS), are just as important.
Imagine that x86 - and with it, the PC platform - gets replaced by ARM within a decade. For binary software, this would be a kind of geological extinction event.
The likelihood of there being a [security] bug discovered in a given software project over any significant period of time is near 100%.
It's definitely a good idea to archive source and binaries and later confirm that the output hasn't changed with and without upgrading the kernel, build userspace, execution userspace, and the PUT/SUT (Package/Software Under Test).
- Specify which versions of which constituent software libraries are utilized. (And hope that a package repository continues to serve those versions of those packages indefinitely). Examples: Software dependency specification formats like requirements.txt, environment.yml, install.R
- Mirror and archive all dependencies and sign the collection. Examples: {z3c.pypimirror, eggbasket, bandersnatch, devpi as a transparent proxy cache}, apt-cacher-ng, pulp, squid as a transparent proxy cache
- Produce a signed archive which includes all requisite software. (And host that download on a server such that data integrity can be verified with cryptographic checksums and/or signatures.) Examples: Docker image, statically-linked binaries, GPG-signed tarball of a virtualenv (which can be made into a proper package with e.g. fpm), ZIP + GPG signature of a directory which includes all dependencies
- Archive (1) the data, (2) the source code of all libraries, and (3) the compiled binary packages, and (4) the compiler and build userspace, and (5) the execution userspace, and (6) the kernel. Examples: Docker can solve for 1-5, but not 6. A VM (virtual machine) can solve for 1-5. OVF (Open Virtualization Format) is an open spec for virtual machine images, which can be built with a tool like Vagrant or Packer (optionally in conjunction with a configuration management tool like Puppet, Salt, Ansible).
When the application requires (7) a multi-node distributed system configuration, something like docker-compose/vagrant/terraform and/or a configuration management tool are pretty much necessary to ensure that it will be possible to reproducibly confirm the experiment output at a different point in spacetime.
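The "signed archive" option from the list above can be sketched with stock tools. The directory layout and filenames are illustrative placeholders; a detached GPG signature could additionally be produced with `gpg --detach-sign experiment.tar.gz`.

```shell
# Bundle data, source, and built artifacts into one archive
mkdir -p experiment/data experiment/src experiment/wheels
echo "raw data" > experiment/data/input.csv
tar czf experiment.tar.gz experiment
# Record a checksum so integrity can be verified at a later point in spacetime
sha256sum experiment.tar.gz > experiment.tar.gz.sha256
# Later verification; prints "experiment.tar.gz: OK" on success
sha256sum -c experiment.tar.gz.sha256
```

A checksum alone only detects corruption; publishing it over a separate trusted channel (or signing it) is what protects against tampering.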
A deep dive into the official Docker image for Python
One thing I've learned is that rather than use a docker-entrypoint.sh, most Linux software can be run just using `--user 1000:1000` or whatever UID/GID you want to use, as long as you map a volume that can use those permissions. It's a lot cleaner this way.
> Why Tini?
> Using Tini has several benefits:
> - It protects you from software that accidentally creates zombie processes, which can (over time!) starve your entire system for PIDs (and make it unusable).
> - It ensures that the default signal handlers work for the software you run in your Docker image. For example, with Tini, SIGTERM properly terminates your process even if you didn't explicitly install a signal handler for it.
> - It does so completely transparently! Docker images that work without Tini will work with Tini without any changes.
[...]
> NOTE: If you are using Docker 1.13 or greater, Tini is included in Docker itself. This includes all versions of Docker CE. To enable Tini, just pass the `--init` flag to docker run.
https://github.com/krallin/tini#why-tini
I didn't know it was included by default. I'll check it out, thanks!
There are Alpine [1] and Debian [2] miniconda images (within which you can `conda install python==3.8` and 2.7 and 3.4 in different conda envs)
[1] https://github.com/ContinuumIO/docker-images/blob/master/min...
[2] https://github.com/ContinuumIO/docker-images/blob/master/min...
If you build manylinux wheels with auditwheel [3], they should install without needing compilation on {CentOS, Debian, Ubuntu, and Alpine}; though standard Alpine images have musl instead of glibc by default, this [4] may work:
echo "manylinux1_compatible = True" > $PYTHON_PATH/_manylinux.py
[3] https://github.com/pypa/auditwheel
[4] https://github.com/docker-library/docs/issues/904#issuecomme...
The miniforge docker images aren't yet [5][6] multi-arch, which means it's not as easy to take advantage of all of the ARM64 / aarch64 packages that conda-forge builds now.
[5] https://github.com/conda-forge/docker-images/issues/102#issu...
[6] https://github.com/conda-forge/miniforge/issues/20
There are i686 and x86-64 docker containers for building manylinux wheels that work with many distros: https://github.com/pypa/manylinux/tree/master/docker
A multi-stage Dockerfile build can produce a wheel in the first stage and install that wheel (with `COPY --from=0`) in a later stage; leaving build dependencies out of the production environment for security and performance: https://docs.docker.com/develop/develop-images/multistage-bu...
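A hedged sketch of that multi-stage pattern, using a named stage rather than `--from=0`. The base image tags and the package name ("mypkg") are assumptions; the Dockerfile is written to a file here so it can be inspected, and would be built with `docker build -f Dockerfile.multistage .`.

```shell
cat > Dockerfile.multistage <<'EOF'
# Stage "build": full toolchain, produces the wheel
FROM python:3.8 AS build
WORKDIR /src
COPY . .
RUN pip wheel --no-deps -w /wheels .

# Final stage: slim image, installs only the wheel, no compilers shipped
FROM python:3.8-slim
COPY --from=build /wheels /wheels
RUN pip install --no-index --find-links=/wheels mypkg
EOF
grep -c '^FROM' Dockerfile.multistage    # one line per stage
```

Build dependencies (gcc, headers) never reach the final image, which shrinks it and reduces the attack surface.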
Interesting! I use miniconda extensively for local development to manage virtual environments for different python versions and love it. I hardly ever actually use the conda packages though.
I assume the main benefit of using these images would be if you are installing from conda repos instead of pip? Otherwise just using the official python images would be as good if not better
Edit: I guess if you needed multiple python versions in a single container this would be a good solution for that as well
Use cases for conda or conda+pip:
- Already-compiled packages (where there may not be binary wheels) instead of requiring reinstallation and subsequent removal of e.g. build-essentials for every install
- Support for R, Julia, NodeJS, Qt, ROS, CUDA, MKL, etc.
- Here's what the Kaggle docker-python Dockerfile installs with conda and with pip: https://github.com/Kaggle/docker-python/blob/master/Dockerfi...
- Build matrix in one container with conda envs
Disadvantages of the official python images as compared with conda+pip:
- Necessary to (re)install build dependencies and a compiler for every build (if there's not a bdist or a wheel for the given architecture) and then uninstall all unnecessary transitive dependencies. This is where a [multi-stage] build of a manylinux wheel may be the best approach.
- No LSM (AppArmor, SELinux, etc.) confinement for one or more processes in the container (which may have read access to /etc or environment variables, and/or --privileged)
- Necessary to build basically everything on non x86[-64] architectures for every container build
Disadvantages of conda / conda+pip:
- Different package repo infrastructure to mirror
- Users complaining that they don't need conda who then proceed to re-download and re-build wheels locally multiple times a day
Additional attributes for comparison:
- The new pip solver (which is slower than the traditional iterative non-solver), conda, and mamba
- repo2docker (and thus BinderHub) can build an up-to-date container from requirements.txt, environment.yml, install.R, postBuild and any of the other dependency specification formats supported by REES: the Reproducible Execution Environment Specification; which may be helpful as Docker Hub images will soon be deleted if they're not retrieved at least once every 6 months (possibly with a GitHub Actions cron task)
Quite a few conda packages have patches added by the conda team to help fix problems in packages relying on native code or binaries, particularly on Windows. If something is available on the primary conda repos it will almost assuredly work with few if any problems cross-platform, whereas pip is hit or miss.
If you’re always on Linux you may never appreciate it but some pip packages are a nightmare to get working properly on Windows.
If you look through the source of the conda repos, you’ll see all kinds of small patches to fix weird and breaking edge cases, particularly in libs with significant C back ends.
Here's the meta.yaml for the conda-forge/python-feedstock: https://github.com/conda-forge/python-feedstock/blob/master/...
It includes patches just like distro packages often do.
The Consortium for Python Data API Standards
This actually strikes me as a huuuge waste of effort. I work with every one of the different technologies they mention every day.
The differences and idiosyncrasies are truly, truly not a big deal and really what you want is to allow different library maintainers to do it all differently and just build your own adapters over top of them.
This allows library developers to worry about narrow use cases and have their own separate processes to introduce breaking changes and features that deal with extreme specifics like GPU or TPU interop, heterogeneous distributed backed arrays, jagged arrays, etc. etc.
Let a thousand flowers blossom and a thousand different Python array and record APIs contend.
End users can write their own adapters to mollify irregularities between them, possibly writing different adapters on a case by case basis.
If any “standard” adapters gain popularity as open source projects, great - but don’t try to bake that in from an RFC point of view into the array & data structure libraries themselves. Let them be free / whatever they want to be. That diversity of API approach is super valuable and easily mollified by your own custom adapter logic.
Personally, I like how the Julia ecosystem coalesced around a Tables.jl package that gives some consistency to tabular data packages that choose to implement it. Row-oriented or column-oriented, doesn't matter! Using Tables.jl they can be interchangeable.
There’s a huge difference between “I personally like it when some libraries choose to adhere to a convention” vs “let’s bake this into every tool with a wide, bureaucratic shared RFC process.”
But I don't think people are forced into following right? There's still choice. But if I was developing something in Python, I would rather join than not join.
I’m saying as a consumer of these libraries, I don’t want joiners, I want lots of different APIs that make different trade-offs for each use case, and I will write (or use) adapters if I need to (that’s my responsibility as a consumer).
Just because I personally either do or do not get some value out of a shared API standard doesn’t mean you do or do not, so personal feelings of what one likes need to be set aside, so the service boundary is at the level of consumers choosing their own adapters and making their own trade-offs.
If lots of libraries sign up, we all lose. Much slower compliance-constrained development cycles, less freedom for libraries to break the shared API contract in ways many users would be super happy to abide for the sake of a faster / better feature. Instead of making these trade-offs library by library, case by case, it’s made for you with standardization compliance as the highest constraint, regardless of the user base for which that does / doesn’t work well.
This strikes me as a bit pessimistic. I think you are mostly against _bad_ standards than standards per se.
If the consortium does its job well it will produce a good standard that all the major libraries like: from a UX as well as a performance viewpoint.
If it produces a bad standard, all that will happen is that nobody will sign up. No libraries are going to be _forced_ to do anything.
That’s fair except I’d say even good standards can cause a big slowdown in feature releases, and not everyone values the adherence to the standard as much as the features. It still seems better to just spin out the standard into a set of optional adapters and let library maintainers and users pick their own trade-offs in terms of when to adopt breaking changes, how / whether to smooth over idiosyncrasies across multiple libraries.
Even a good standard has costs and this particular case does not seem like it has good arguments in favor of a wide standard, but many arguments against it.
No, it's easy for library maintainers to offer a compat API in addition to however else they feel they need to differentiate and optimize the interfaces for array operations. People can contribute such APIs directly to libraries once instead of creating many conditionals in every library-utilizing project or requiring yet another dependency on an adapter / facade package that's not kept in sync with the libraries it abstracts.
If a library chooses to implement a spec compatibility API, they do it once (optimally, compared with somebody's hackish adapter facade that has very little comprehension of each library's internals), and everyone else's code doesn't need conditionals.
Each of L libraries implements a compat API once: O(L)
Each of U library utilizers implements conditionals in every one of the N̄ places arrays are used: O(U × N̄)
Each of U library utilizers uses the common-denominator compat API: O(U)
L < U < (L + U) < (U × N̄)
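To make the O(L) vs O(U × N̄) argument concrete, here is a minimal Python sketch of the idea. All names (`ListBackend`, `ListCompat`, `asarray`, `mean`) are invented for illustration and are not taken from any real array-API spec:

```python
class ListBackend:
    """A toy 'array library' whose native API differs from the shared one."""
    @staticmethod
    def make(values):           # native, idiosyncratic entry point
        return list(values)

    @staticmethod
    def average(xs):            # native name differs from the shared surface
        return sum(xs) / len(xs)


class ListCompat:
    """The O(L) part: one adapter, written once, inside the library."""
    @staticmethod
    def asarray(values):
        return ListBackend.make(values)

    @staticmethod
    def mean(xs):
        return ListBackend.average(xs)


def consumer_mean(compat, values):
    """The O(U) part: consumers code against the shared surface only,
    with no per-library conditionals at each call site."""
    return compat.mean(compat.asarray(values))
```

Each library keeps its native API free to evolve; only the thin compat namespace is bound by the shared contract.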
Tech giants let the Web's metadata schemas and infrastructure languish
It's "languishing" and they should do it for us? It's flourishing, they're doing it for us, they have lots of open issues, and I want more for free without any work.
Wow! Nobody else does anything to collaboratively, inclusively develop schema and the problem is that search engines aren't just doing it for us?
1) Search engines do not owe us anything. They are not obligated to dominate us or the schema that we may voluntarily decide to include on our pages.
We've paid them nothing. They have no contract for service or agreement with us which compels them to please us or contribute greater resources to an open standard that hundreds of people are contributing to.
2) You people don't know anything about linked data and structured data.
Here's a list of schema: https://lov.linkeddata.es/dataset/lov/ .
Here's the Linked Open Data Cloud: https://lod-cloud.net/
Does your or this publisher's domain include any linked data?
Does this article include any linked data?
Do data quality issues pervade promising, comparatively-expensive, redundant approaches to natural-language comprehension, reasoning, and summarization?
Here, in contributing this example PR adding RDFa to the codeforantarctica web page, I probably made a mistake. https://github.com/CodeForAntarctica/codeforantarctica.githu... . Can you spot the mistake?
There should have been review.
https://schema.org/ClaimReview, W3C Verifiable Claims / Credentials, ld-signatures, and lds-merkleproof2017.
Which brings us to reification, truth values, property graphs, and the new RDF* and SPARQL* and JSON-LD* (which don't yet have repos with ongoing issues to tend to).
3) Get to work. This article does nothing to teach people how to contribute to slow, collaborative schema standards work.
Here's the link to the GitHub Issues so that you can contribute to schema.org: https://github.com/schemaorg/schemaorg
...
"Standards should be better and they should pay for it"
Who are the major contributors to the (W3C) open standard in question?
Is telling them to put up more money or step down going to result in getting what we want? Why or why not?
Who would merge PRs and close issues?
Have you misunderstood the scope of the project? What do the editors of the schema feel about more specific domain vocabularies? Is it feasible or even advisable to attempt to out-schema domain experts who know how to develop and revise an ontology, or even just a vocabulary, with Protégé?
To give you a sense of how much work goes into creating a few classes and properties defined with RDFS in RDFa in HTML: here's the https://schema.org/Course , https://schema.org/CourseInstance , and https://schema.org/EducationEvent issue: https://github.com/schemaorg/schemaorg/issues/195
Can you find the link to the Use Cases wiki (which was the real work)? What strategy did you use to find it?
...
"Well, Google just does what's good for Google."
Are you arguing that Google.org should make charitable contributions to this project? Is that an advisable or effective way to influence a W3C open standard (where conflicts of interest by people just donating time are disclosed)?
Anyone can use something like extruct or OSDS to extract RDFa, Microdata, and/or JSON-LD from a page.
Everyone can include structured data and linked data in their pages.
There are surveys quantifying how many people have included which types in their pages. Some of that data is included on schema.org types pages.
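As a concrete illustration of extracting structured data from a page, here is a deliberately minimal, stdlib-only Python stand-in for part of what tools like extruct or OSDS do. It only handles JSON-LD script blocks; the real tools also parse RDFa and Microdata:

```python
import json
from html.parser import HTMLParser


class JSONLDExtractor(HTMLParser):
    """Collect and parse the contents of <script type="application/ld+json">
    blocks. A toy stand-in for full structured-data extractors."""

    def __init__(self):
        super().__init__()
        self._in_jsonld = False
        self.items = []

    def handle_starttag(self, tag, attrs):
        if tag == "script" and ("type", "application/ld+json") in attrs:
            self._in_jsonld = True

    def handle_endtag(self, tag):
        if tag == "script":
            self._in_jsonld = False

    def handle_data(self, data):
        if self._in_jsonld and data.strip():
            self.items.append(json.loads(data))


html = """<html><head>
<script type="application/ld+json">
{"@context": "https://schema.org", "@type": "Person", "name": "Ada"}
</script></head><body></body></html>"""

parser = JSONLDExtractor()
parser.feed(html)
```

After `feed()`, `parser.items` holds the parsed JSON-LD objects found in the page.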
...
Some written interview questions:
> Which issues have you contributed to? Which issues have you seen all the way to closed? Have you contributed a pull request to the project? Have you published linked data? What is the URL to the docs which explain how to contribute resources? How would you improve them?
https://twitter.com/westurner/status/1291903926007209984
...
After all that's happened here, I think Dan (who built FOAF, which all profitable companies could use instead of https://schema.org/Person ) deserves a week off to add more linked data to the internet now please.
I think that might be fair, but when schema.org came out, it pitched itself as "trust us, we will take care of things." So yeah, they don’t owe us anything, but track record matters for trust in future ventures by these search-engine orgs.
schemaorg/schemaorg/CONTRIBUTING.md https://github.com/schemaorg/schemaorg/blob/main/CONTRIBUTIN... explains how you and your organization can contribute resources to the Schema.org W3C project.
If you or your organization can justify contributing one or more people at full or part time due to ROI or goodwill, by all means start sending Pull Requests and/or commenting on Issues.
"Give us more for free or step down". Wow. What PRs have you contributed to justify such demands?
https://schema.org/docs/documents.html links to the releases.
Time-reversal of an unknown quantum state
T-symmetry https://en.wikipedia.org/wiki/T-symmetry > See also links to "reversible computing" but not the "time reversal" disambiguation page?
The "teeter-totter" thought experiment on that page is interesting to me in that it seems to illustrate how the future has more possibilities than the past. But it also occurs to me that the future possibilities are fractal with the past, i.e. the toy could fall onto one of many other lower pedestals, teetering as it was before. In this way the "many arbitrary" possibilities could be perceived as anything but.
Electric cooker an easy, efficient way to sanitize N95 masks, study finds
An autoclave is what the electrical cooker is being used to emulate. A pressure cooker is a better option if you'll be on the road, camping, or otherwise without electricity. Researchers in Canada already proved several months ago that N95 masks could be sanitized this way:
Unfortunately the referenced NewsArticle does not link to the ScholarlyArticle https://schema.org/ScholarlyArticle :
"N95 Mask Decontamination using Standard Hospital Sterilization Technologies" (2020-04) https://www.medrxiv.org/content/10.1101/2020.04.05.20049346v... :
> We sought to test the ability of 4 different decontamination methods including autoclave treatment, ethylene oxide gassing, ionized hydrogen peroxide fogging and vaporized hydrogen peroxide exposure to decontaminate 4 different N95 masks of experimental contamination with SARS-CoV-2 or vesicular stomatitis virus as a surrogate. In addition, we sought to determine whether masks would tolerate repeated cycles of decontamination while maintaining structural and functional integrity. We found that one cycle of treatment with all modalities was effective in decontamination and was associated with no structural or functional deterioration. Vaporized hydrogen peroxide treatment was tolerated to at least 5 cycles by masks. Most notably, standard autoclave treatment was associated with no loss of structural or functional integrity to a minimum of 10 cycles for the 3 pleated mask models. The molded N95 mask however tolerated only 1 cycle. This last finding may be of particular use to institutions globally due to the virtually universal accessibility of autoclaves in health care settings.
The ScholarlyArticle referenced by and linked to from the OP NewsArticle is "Dry Heat as a Decontamination Method for N95 Respirator Reuse" (2020-07) https://pubs.acs.org/doi/full/10.1021/acs.estlett.0c00534 . Said article does not reference "N95 Mask Decontamination using Standard Hospital Sterilization Technologies" DOI: 10.1101/2020.04.05.20049346v2 . We would do well to record that (article A, seemsToConfirm, article B) as third-party linked data (only if both articles do specifically test the efficacy of the given sterilization method with the COVID-19 coronavirus).
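A sketch of what that third-party assertion might look like as JSON-LD. Note that `seemsToConfirm` is this comment's own coinage, not a schema.org property, so the `ex:` namespace below is hypothetical; a real deployment would need a published vocabulary defining the property:

```python
import json

# Hypothetical serialization of (article A, seemsToConfirm, article B).
# The "ex:" namespace and the claim @id are invented for illustration.
claim = {
    "@context": {
        "schema": "https://schema.org/",
        "ex": "https://example.org/vocab#",   # hypothetical vocabulary
    },
    "@id": "ex:claim-1",
    "@type": "schema:ClaimReview",
    "schema:itemReviewed": {
        "@id": "https://doi.org/10.1021/acs.estlett.0c00534",
        "@type": "schema:ScholarlyArticle",
    },
    "ex:seemsToConfirm": {
        "@id": "https://doi.org/10.1101/2020.04.05.20049346",
        "@type": "schema:ScholarlyArticle",
    },
}

serialized = json.dumps(claim, indent=2)
```

Publishing such a claim separately from both articles is what makes it third-party linked data.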
None of this is necessary for non-healthcare workers, since the virus will die off on a fabric mask in a few hours.
1) Not a single study has demonstrated that viable Sars-Cov-2 virus survives on porous materials in the real world
2) Even before Covid, it was known for decades (common medical knowledge) that human coronaviruses and flu viruses do not remain viable on porous materials for more than several hours.
That this common medical knowledge is not so common among the public is a failure of public health communication. The one or two alarmist studies showing that the virus "survives" X number of days don't reflect the real world because 1) the researchers literally douse or soak the surface with a huge viral load, and 2) the researchers usually only look for viral genetic material, not whether the virus can still infect cells. Viability != detecting viral genetic material (same story for people: we shed viral genetic material long after we stop being contagious).
There is a reason the CDC has always said surface transmission is rare: https://www.nytimes.com/2020/05/22/health/cdc-coronavirus-to... There have been no documented cases of surface transmission: https://www.cdc.gov/coronavirus/2019-ncov/prevent-getting-si... Fortunately, some in the media are catching on to the hygiene (security) theater: https://www.theatlantic.com/ideas/archive/2020/07/scourge-hy...
Citations for 1) and 2):
Most studies use unrealistic starting doses: https://www.thelancet.com/pdfs/journals/laninf/PIIS1473-3099...
Other human coronaviruses don't survive long on porous surfaces (gone by 6 hours): https://www.ncbi.nlm.nih.gov/pmc/articles/PMC7134510/. A hospital randomly collected patient and real surface swabs (including non-porous surfaces) for original SARS, a minority were PCR positive, none of the swabs were infectious (viral culture): https://www.ncbi.nlm.nih.gov/pmc/articles/PMC7134510/. Same story for flu (virus can be detected for days, but inactive and not viable after a few hours): https://www.ncbi.nlm.nih.gov/pmc/articles/PMC3222642/.
I could only find this study on Sars-Cov-2 that cultured the virus (still used a huge viral dose in a lab setting, not the real world). Even though they were able to culture the virus, only 1% of virus remains after 6 hours on a surgical mask, several orders of magnitude less for cotton clothing: https://www.medrxiv.org/content/10.1101/2020.05.07.20094805v...
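For a rough sense of how fast that decay is: assuming simple exponential kinetics (an assumption, since the study doesn't report a decay model), "1% remaining after 6 hours" implies a half-life of under an hour:

```python
import math

# If N(t) = N0 * 2**(-t / half_life) and only 1% remains after 6 hours,
# then half_life = t * ln(2) / ln(N0 / N(t)).
hours = 6.0
fraction_remaining = 0.01
half_life = hours * math.log(2) / math.log(1 / fraction_remaining)
# roughly 0.9 hours, i.e. about 54 minutes
```

So under that assumption, the viral load on a surgical mask halves roughly every hour; cotton clothing, per the study, decays several orders of magnitude faster still.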
This long post was originally meant to be a reply to someone asking for a citation; it's yet another example of how much effort is required to combat misinformation.
"Interim Recommendations for U.S. Households with Suspected or Confirmed Coronavirus Disease 2019 (COVID-19)" https://www.cdc.gov/coronavirus/2019-ncov/prevent-getting-si... :
> On the other hand, transmission of novel coronavirus to persons from surfaces contaminated with the virus has not been documented. Recent studies indicate that people who are infected but do not have symptoms likely also play a role in the spread of COVID-19. Transmission of coronavirus occurs much more commonly through respiratory droplets than through objects and surfaces, like doorknobs, countertops, keyboards, toys, etc. Current evidence suggests that SARS-CoV-2 may remain viable for hours to days on surfaces made from a variety of materials. Cleaning of visibly dirty surfaces followed by disinfection is a best practice measure for prevention of COVID-19 and other viral respiratory illnesses in households and community settings
Fed announces details of new interbank service to support instant payments
If anyone from the Fed is reading this, I have this advice: don't use ISO 20022! (I've been working on the Brazilian instant payments project, known as PIX.) It's an overly-complicated, try-to-solve-everything standard that promises interoperability but delivers only pain and misery. Since each country has its own standards for person and account identification, ISO 20022 adds quite a lot of complexity in dealing with these local realities. But all this cost does not result in interoperability. In fact, each payment scheme ends up with a localized version of the standard, incompatible with the others. What could have been a clean API ends up a mess.
What alternative do you suggest? Keeping in mind all the parties that will be involved, besides your project.
Create a fit-for-purpose, simpler protocol using a good interface specification language (OpenAPI or Protocol Buffers). Maybe borrow concepts and patterns from the ISO standard, but don't adopt it. It will not deliver its promise of interoperability, and it might actually make it harder to interoperate with other payment schemes.
Interledger Protocol (ILP, ILPv4).
Interledger Architecture:
https://interledger.org/rfcs/0001-interledger-architecture/#... :
> For purposes of Interledger, we call all settlement systems ledgers. These can include banks, blockchains, peer-to-peer payment schemes, automated clearing house (ACH), mobile money institutions, central-bank operated real-time gross settlement (RTGS) systems, and even more.
[...]
> Interledger provides for secure payments across multiple assets on different ledgers. The architecture consists of a conceptual model for interledger payments, a mechanism for securing payments, and a suite of protocols that implement this design.
> The Interledger Protocol (ILP) is the core of the Interledger protocol suite. Colloquially, the whole Interledger stack is sometimes referred to as "ILP". Technically, however, the Interledger Protocol is only one layer in the stack.
> Interledger is not a blockchain, a token, nor a central service. Interledger is a standard way of bridging financial systems. The Interledger architecture is heavily inspired by the Internet architecture described in RFC 1122, RFC 1123 and RFC 1009.
[...]
> You can envision the Interledger as a graph where the points are individual nodes and the edges are accounts between two parties. Parties with only one account can send or receive through the party on the other side of that account. Parties with two or more accounts are connectors, who can facilitate payments to or from anyone they're connected to.
> Connectors [AKA routers] provide a service of forwarding packets and relaying money, and they take on some risk when they do so. In exchange, connectors can charge fees and derive a profit from these services. In the open network of the Interledger, connectors are expected to compete among one another to offer the best balance of speed, reliability, coverage, and cost.
ILP > Peering, Clearing and Settling: https://interledger.org/rfcs/0032-peering-clearing-settlemen...
ILP > Simple Payment Setup Protocol (SPSP): https://interledger.org/rfcs/0009-simple-payment-setup-proto...
> This document describes the Simple Payment Setup Protocol (SPSP), a basic protocol for exchanging payment information between payee and payer to facilitate payment over Interledger. SPSP uses the STREAM transport protocol for condition generation and data encoding.
> (Introduction > Motivation) STREAM does not specify how payment details, such as the ILP address or shared secret, should be exchanged between the counterparties. SPSP is a minimal protocol that uses HTTPS for communicating these details.
[...]
GET /.well-known/pay HTTP/1.1
Host: example.com
Accept: application/spsp4+json, application/spsp+json

Shrinking deep learning’s carbon footprint
"Unlearning" is one algorithmic approach that may yield substantial energy consumption gains.
With many deep learning models, it's not possible to determine when or from what source something was learned: there's no way to "back out" a change to the network, so the whole model has to be re-trained from scratch, which is O(n) in the training data rather than a small incremental cost.
The article covers software approaches (more energy-efficient algorithms) and mentions GPUs but not TPUs or ASICs.
Specialized chips (built with dynamic fabrication capacities) are far more energy efficient for specific types of workloads. We see this with mining ASICs, SSL accelerators, and also with Tensor Processing Units (for deep learning).
The externalities of energy production are the ultimate concern. If you're using cheap, clean energy with minimized external costs ("sustainable energy"), the energy-efficiency of the algorithm and the chips is of much less concern.
Could we recognize products, services, and data centers that were produced with and/or run on directly sourced clean energy as "200% Green", with a logo on the box and/or in the footer? 100% offset by PPAs is certainly progress.
Show HN: Starboard – Fully in-browser literate notebooks like Jupyter Notebook
Hi HN, I developed Starboard over the past months.
Cell-by-cell notebooks like Jupyter are great for prototyping, explaining and exploration, but their dependence on a Python server (with often undocumented dependencies) limits their ability to be shared and remixed. Now that browsers support dynamic imports, it has become possible to create a similar workflow entirely in the browser.
That motivated me to build Starboard Notebook, a tool I wished existed. It's:
* Run entirely in the browser, there is no server or setup, it's all static files.
* Web-native, so no widget system is necessary. There is nearly no magic, it's all web tech (HTML, CSS, JS).
* Stores as a plaintext file, it will play nicely with version control systems.
* Hackable: the sandbox your code runs in contains the editor itself, so you can metaprogram the editor (e.g. adding support for other languages such as Python through WASM).
* Open source (https://github.com/gzuidhof/starboard-notebook).
You can import any code that targets the browser directly (e.g. puts stuff on the window object), or that has exports in ES module format.
I'm happy to answer any questions!
Neat! There's a project called Jyve that compiles Jupyter Lab to WASM (using iodide). https://github.com/deathbeds/jyve There are kernels for JS, CoffeeScript, Brython, TypeScript, and P5. FWIU, the kernels are marked as unsafe because, unfortunately, there seems to be no good way to sandbox user-supplied notebook code from the application instance. The README describes some of the vulnerabilities that this entails.
The jyve project issues discuss various ideas for repacking Python packages beyond the set already included with Pyodide and supporting loading modules from remote sources.
https://developer.mozilla.org/en-US/docs/Web/Security/Subres... : "Subresource Integrity (SRI) is a security feature that enables browsers to verify that resources they fetch (for example, from a CDN) are delivered without unexpected manipulation. It works by allowing you to provide a cryptographic hash that a fetched resource must match."
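For reference, an SRI integrity value is just the base64-encoded digest of the resource bytes, prefixed with the hash algorithm's name. A minimal sketch:

```python
import base64
import hashlib


def sri_sha384(resource_bytes: bytes) -> str:
    """Compute a Subresource Integrity value ("sha384-<base64 digest>"),
    the format used in <script integrity="..."> attributes."""
    digest = hashlib.sha384(resource_bytes).digest()
    return "sha384-" + base64.b64encode(digest).decode("ascii")


integrity = sri_sha384(b"console.log('hello');")
```

The browser recomputes the digest of the fetched resource and refuses to execute it if the value doesn't match.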
There's a new Native Filesystem API: "The new Native File System API allows web apps to read or save changes directly to files and folders on the user's device." https://web.dev/native-file-system/
We'll need a way to grant specific URLs specific, limited amounts of storage.
https://github.com/iodide-project/pyodide :
> The Python scientific stack, compiled to WebAssembly
> [...] Pyodide brings the Python 3.8 runtime to the browser via WebAssembly, along with the Python scientific stack including NumPy, Pandas, Matplotlib, parts of SciPy, and NetworkX. The packages directory lists over 35 packages which are currently available.
> Pyodide provides transparent conversion of objects between Javascript and Python. When used inside a browser, Python has full access to the Web APIs.
https://github.com/deathbeds/jyve/issues/46 :
> Would miniforge and conda-forge build a WASM architecture target?
> Emscripten or WASI?
Ask HN: Learning about distributed systems?
I used to love Operating Systems during my undergrad; Modern Operating Systems by Tanenbaum is to date the only academic book I've read entirely. I recently read an article by Werner Vogels about how Amazon built Aurora, and I was captivated by it. I want to start reading about Distributed Systems. What would be a good start / road map?
The book “Designing Data-Intensive Applications” by Martin Kleppmann is a fantastic read with a concise train of thought. It builds up from the basics, adds another thing, and then another.
I kept asking myself, what would happen if I were to extend on the feature currently presented in the chapter I was reading, only to find out my answers in the next chapter.
Brilliant book
> “Designing Data-Intensive Applications” by Martin Kleppmann: https://dataintensive.net/ https://g.co/kgs/xJ73FS
From a previous question, "Ask HN: CS papers for software architecture and design?" (https://news.ycombinator.com/item?id=15778396), on the distributed systems we eventually realize were needed in the first place:
> Bulk Synchronous Parallel: https://en.wikipedia.org/wiki/Bulk_synchronous_parallel .
Many/most (?) distributed systems can be described in terms of BSP primitives.
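A toy sketch of one BSP superstep (local computation, then communication, then a barrier), using Python threads purely for illustration; real BSP systems run across machines, not threads:

```python
import threading

# Each worker computes a partial sum (local computation), publishes it to a
# shared mailbox (communication), then waits at the barrier before any
# worker reads the combined result. The barrier is the superstep boundary.
data = [list(range(0, 5)), list(range(5, 10))]
partials = [0, 0]
totals = [0, 0]
barrier = threading.Barrier(2)


def worker(i):
    partials[i] = sum(data[i])   # local computation
    barrier.wait()               # end of superstep: all messages delivered
    totals[i] = sum(partials)    # safe: every partial is now published


threads = [threading.Thread(target=worker, args=(i,)) for i in range(2)]
for t in threads:
    t.start()
for t in threads:
    t.join()
```

Without the barrier, a worker could read `partials` before its peer had written, which is exactly the class of race that the BSP model rules out by construction.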
> Paxos: https://en.wikipedia.org/wiki/Paxos_(computer_science) .
> Raft: https://en.wikipedia.org/wiki/Raft_(computer_science)#Safety
> CAP theorem: https://en.wikipedia.org/wiki/CAP_theorem .
Papers-we-love > Distributed Systems: https://github.com/papers-we-love/papers-we-love/tree/master...
awesome-distributed-systems also has many links to theory: https://github.com/theanalyst/awesome-distributed-systems
- Byzantine fault: https://en.wikipedia.org/wiki/Byzantine_fault :
> A [Byzantine fault] is a condition of a computer system, particularly distributed computing systems, where components may fail and there is imperfect information on whether a component has failed. The term takes its name from an allegory, the "Byzantine Generals Problem",[2] developed to describe a situation in which, in order to avoid catastrophic failure of the system, the system's actors must agree on a concerted strategy, but some of these actors are unreliable.
awesome-bigdata lists a number of tools: https://github.com/onurakpolat/awesome-bigdata
Practically, dask.distributed (joblib -> SLURM), dask ML, dask-labextension (a JupyterLab extension for dask), and the Rapids.ai tools (e.g. cuDF) scale from one to many nodes.
Not without a sense of irony, given that the lists above contain many papers that could serve as readings with quizzes, here are some broader links:
Distributed systems -> Distributed computing: https://en.wikipedia.org/wiki/Distributed_computing
Category: Distributed computing: https://en.wikipedia.org/wiki/Category:Distributed_computing
Category:Distributed_computing_architecture : https://en.wikipedia.org/wiki/Category:Distributed_computing...
DLT: Distributed Ledger Technology: https://en.wikipedia.org/wiki/Distributed_ledger
Consensus (computer science) https://en.wikipedia.org/wiki/Consensus_(computer_science)
Ask HN: How can I “work-out” critical thinking skills as I age?
As I get older, I’ve realized I’m not as sharp as I used to be. Maybe it’s from the fatigue of juggling two kids, but I’m very ill-prepared for interviews because I simply can’t answer “product questions” and brain teasers. It’s a skill I need, and truthfully I was never good at consultant-type questions to begin with, but I’m seeing a lot of these questions in Data Science interviews.
Any help or resources will be tremendously appreciated.
Problem solving: https://en.wikipedia.org/wiki/Problem_solving
Critical thinking: https://en.wikipedia.org/wiki/Critical_thinking
Computational Thinking: https://en.wikipedia.org/wiki/Computational_thinking
> 1. Problem formulation (abstraction);
> 2. Solution expression (automation);
> 3. Solution execution and evaluation (analyses).
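A toy walk-through of those three steps on an invented, interview-style estimation question (all numbers are assumptions, which is rather the point of step 1):

```python
# Question (invented): "How many pages can one office printer produce
# in a workday?"

# 1. Problem formulation (abstraction): reduce it to rate * time, with
#    assumptions stated out loud instead of one guessed number.
pages_per_minute = 20            # assumption, stated explicitly
minutes_per_workday = 8 * 60     # assumption: one 8-hour shift


# 2. Solution expression (automation): encode the model as a function.
def pages_per_day(rate, minutes):
    return rate * minutes


# 3. Solution execution and evaluation (analyses): run it, then check
#    how sensitive the answer is to each assumption.
estimate = pages_per_day(pages_per_minute, minutes_per_workday)
halved_rate = pages_per_day(pages_per_minute / 2, minutes_per_workday)
```

Walking an interviewer through the assumptions and the sensitivity check tends to matter more than the final number.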
Interviewers may be more interested in seeing problem-solving methods demonstrated and hearing you think aloud than in an actual solution in an anxiety-producing scenario.
https://en.wikipedia.org/wiki/Brilliant_(website) :
> Brilliant offers guided problem-solving based courses in math, science, and engineering, based on National Science Foundation research supporting active learning.[14]
Coding Interview University: https://github.com/jwasham/coding-interview-university
Programmer Competency Matrix: https://github.com/hltbra/programmer-competency-checklist
Inference > See also: https://en.wikipedia.org/wiki/Inference
- Deductive reasoning: https://en.wikipedia.org/wiki/Deductive_reasoning
- Inductive reasoning: https://en.wikipedia.org/wiki/Inductive_reasoning
> This is the [open] textbook for the Foundations of Data Science class at UC Berkeley: "Computational and Inferential Thinking: The Foundations of Data Science" http://inferentialthinking.com/
The tragedy of FireWire: Collaborative tech torpedoed by corporations
Due to DMA (Direct Memory Access) in most implementations, IEEE 1394 ("FireWire") can be used to directly read from and write to RAM.
See: IEEE 1394 > Security issues https://en.wikipedia.org/wiki/IEEE_1394#Security_issues
FWIU, USB 3 is faster than FireWire; there are standard, interchangeable USB connectors and adapters; and USB implementations do not use DMA. https://en.wikipedia.org/wiki/USB_3.0
> FWIU, USB 3 is faster than FireWire
You've missed the point. The title of the article reads "The tragedy of FireWire": how a good standard was victimized by corporate politics. FireWire was 10 years ahead of USB.

In 2002, FireWire 800 provided a data rate of 800 Mbps, full-duplex. USB 2, on the other hand, only had 480 Mbps, half-duplex, and USB didn't catch up until 2010, after USB 3.0 was released. FireWire uses 30 V, 1.5 A power and supports power delivery up to 45 watts; USB offered nothing similar, not even in USB 3, until Power Delivery and Type-C were standardized in 2014, and real applications only started to ramp up by ~2017. Both limitations of USB 2 were detrimental to some applications: using hard drives on a PC was painful due to speed and power constraints. Had FireWire become more common, that would not have been an issue.

FireWire can also do everything Ethernet can do (in fact, 1394b's signaling is better than 100Base-TX Ethernet's, with lower radiation). Networking is natively supported: you can network two computers together directly with a 1394 cable! On USB, that's only possible with an external controller. One application was NAS: a hard drive could be shared directly over 1394 and mounted as a local drive. Not possible with USB. Finally, 1394b does not have USB's 5-meter limitation due to protocol timing, and it natively supports signaling over CAT-5 and fiber-optic cable for distances of ~50-100 meters; on USB, that requires a media converter running vendor-specific, non-interoperable protocols. All of these features were standardized by 2004 in IEEE 1394b.
Of course, these limitations are not inherently USB 2's failures; the two have different applications, FireWire was more expensive, and USB's goal was low cost. But the tragedy was that bad business decisions prevented FireWire from achieving its full potential; it slowly faded out and eventually became irrelevant after USB 3, despite its initial good engineering.
FireWire was a tragedy.
> Due to DMA (Direct Memory Access) in most implementations, IEEE 1394 ("FireWire") can be used to directly read from and write to RAM.
This is why we need IOMMU.
You should not blame IEEE 1394 for DMA attacks just because it supports DMA. I agree that the tragedy of FireWire probably prevented a common PC exploit vector as a lucky, unintentional consequence, but that's just a symptom, not the actual issue, and the symptom has come back in another form.
Most sufficiently fast hardware interfaces support DMA, and they're equally vulnerable; this includes PCI-E, ExpressCard (which exposes PCI-E), USB 4 (which exposes a PCI-E port), and Thunderbolt (which exposes PCI-E). Practical exploits are already widespread, not limited to arbitrary memory reads/writes but also including Option ROM arbitrary code execution if a Thunderbolt device is plugged in at boot time.
And this is not only a problem for external ports like 1394, USB 4, or Thunderbolt. Threats also exist from internal devices, like an Ethernet or Wi-Fi controller on the motherboard. While you cannot initiate a DMA transaction via Ethernet (without RDMA) or Wi-Fi, since they don't expose PCI-E, the fact that those controllers are connected directly to PCI-E means any vulnerability in the controller firmware potentially allows an attacker to launch a DMA attack. Exploits already exist for Wi-Fi controllers.
This is the real problem. While the IOMMU mechanism exists, CPU manufacturers and operating systems previously did little to systematically protect the system from DMA. In the past, Intel intentionally removed IOMMU functionality from low-end CPUs and motherboard chipsets in order to sell more Xeon CPUs to enterprise users. And while the IOMMU is supported by operating systems, kernel code and drivers were not systematically audited for security; serious IOMMU-bypass vulnerabilities have previously been discovered (and patched) in macOS's and Linux's IOMMU implementations.
With the continued proliferation of USB 4 and Thunderbolt, the lack of an IOMMU and buggy driver implementations will become a bigger problem.
So far, QubesOS is the operating system best protected from DMA attacks. Untrusted hardware is assigned to a dedicated VM, and the IOMMU is enforced by the hypervisor, not by the individual device drivers in the operating system. Compromising a device driver in an untrusted domain does not directly expose the rest of QubesOS to attackers.
> and USB implementations do not use DMA.
USB 4 does now.
So your argument is that not security but cost is the reason that USB "won" the external device interface competition with FireWire?
Good to know that USB 4 implementations are making the same mistake FireWire implementors did in choosing performance over security. Unfortunately, it looks like there will be no alternative except maybe to use a USB 3 hub (or an OS with fuzzed IOMMU and controller firmwares)?
Could an NX bit for data coming from buses with and without DMA help at all?
Hot gluing external ports now seems a bit more rational and justified for systems where physical access is less controlled.
> So your argument is that not security, but cost is the reason that USB "won" the external device interface competition with FireWire?
Sorry, but I strongly suspect you didn't read the original article, and I suggest you do so.
Security had nothing to do with 1394's market adoption. In the 2010s, some desktops still had 1394 and some laptops still had ExpressCard, and you didn't see anyone complaining about security, only some security researchers.
The reason was partially cost, but also a bad business decision by Steve Jobs, which made Intel and Microsoft stop their efforts to push 1394, largely damaging it before it even got a chance at wider adoption.
> FireWire's principal creator, Apple, nearly killed it before it could appear in a single device. And eventually the Cupertino company effectively did kill FireWire, just as it seemed poised to dominate the industry.
> Intel sent its CTO to talk to Jobs about the change, but the meeting went badly. Intel decided to withdraw its support for FireWire—to pull the plug on efforts to build FireWire into its chipsets—and instead throw its weight behind USB 2.0.
> Sirkin believes that Microsoft could have reversed the new licensing policy by citing the prior signed agreement. "Microsoft must have thrown it away," he speculated, because it would have "stopped Apple in its tracks."
--
> Good to know that USB4 implementations are making the same mistake as FireWire implementors did in choosing performance over security.
DMA is optional and can be disabled.
Speaking of usability, DMA is limited to the Thunderbolt part (which is ultimately the PCI-E part) of the USB 4 standard. You don't have to use USB 4's PCI-E. Unless the USB device really needs PCI-E, like a RAID storage box or a GPU, disabling DMA is not a problem. Requiring explicit user consent for PCI-E is also a solution; operating systems already do this for Thunderbolt devices, and the same treatment can be applied to USB 4.
And again, calling out DMA support is only shooting the messenger while ignoring the underlying problem. DMA is essential for any high-performance system bus, and it can be a problem regardless of who supports it or whether it's connected to an external port. A Wi-Fi controller handles untrusted data, and despite being an internal device on the motherboard, it's still potentially a vector for DMA attacks.
> (or an OS with fuzzed IOMMU and also controller firmwares)
Only the IOMMU is needed (and ideally a good driver, but that's not strictly required; see QubesOS). If your IOMMU is good, you don't have to trust firmware or hardware, since arbitrary DMA is blocked by the IOMMU, just like how memory protection (the MMU) works but for I/O. Memory spaces are isolated.
The IOMMU vulnerabilities I previously mentioned are already patched, and a complete bypass of the IOMMU is unlikely to occur in the future. Driver bugs are still an issue: drivers can expose more RAM regions than are really needed, and an attacker can potentially pretend to be any hardware and feed bad data to trick the most vulnerable driver into exposing even more RAM. I expect to see more bugs.
Turn on IOMMU, and apply OS patches often, and you'll probably be fine.
I read much of the article, which assumed that FireWire failed because suppliers failed to work together, rather than because of waning demand (due in part to corporate customers' knowledge of the security risks of most implementations).
Thanks for the info on USB-4, DMA, IOMMU.
IOMMU: https://en.wikipedia.org/wiki/Input%E2%80%93output_memory_ma...
Looks like there are a number of iommu Linux kernel parameters: https://www.kernel.org/doc/html/latest/admin-guide/kernel-pa...
Wonder what the defaults are and what the comparable parameters are for common consumer OSes.
Looks like NX bit support is optional in IOMMUs.
Can I configure the amount of RAM allocated to this?
> Wonder what the defaults
See the Thunderclap FAQ, most of your questions are explained. Thunderclap is the DMA attack on Thunderbolt, and everything should be applicable to USB 4 as well, since USB 4 incorporated Thunderbolt directly.
From the FAQ,
> In macOS 10.12.4 and later, Apple addressed the specific network card vulnerability we used to achieve a root shell. However the general scope of our work still applies; in particular that Thunderbolt devices have access to all network traffic and sometimes keystrokes and framebuffer data.
> Microsoft have enabled support for the IOMMU for Thunderbolt devices in Windows 10 version 1803, which shipped in 2018. Earlier hardware upgraded to 1803 requires a firmware update from the vendor. This brings them into line with the baseline for our work, however the more complex vulnerabilities we describe remain relevant.
> Recently, Intel have contributed patches to version 5.0 of the Linux kernel (shortly to be released) that enable the IOMMU for Thunderbolt and prevent the protection-bypass vulnerability that uses the ATS feature of PCI Express.
But note that Microsoft says on its website,
> This feature does not protect against DMA attacks via 1394/FireWire, PCMCIA, CardBus, ExpressCard, and so on.
Basically, IOMMU is enabled by default in most systems now. However, only on Thunderbolt devices (and USB 4 devices, since Thunderbolt is now its subset). But not on other PCI-E ports, such as the internal PCI-E ports, or external ones like 1394 or ExpressCard. I think it's due to fears of performance and compatibility issues, there's no reason why it cannot be implemented if one insists. Also, from the macOS example, you see that bad drivers can be a problem, even with IOMMU, if a driver is bad, attackers can exploit them and trick them to give more RAM access (If my memory is correct, Linux's implementation was better, but since it's an early FAQ, I don't know if macOS has improved).
> what the comparable parameters are for common consumer OSes.
On the Linux kernel with an Intel CPU, the kernel option "intel_iommu=on" enables the IOMMU for everything, internal and external, which is the ideal option. You should see
DMAR: IOMMU enabled
in dmesg. Some people report compatibility issues with Intel's integrated GPU that cause glitches or system hangs; using "intel_iommu=on,igfx_off" should fix that (and is not really a security problem, since the GPU is already inside the CPU). I've been running my PC with the IOMMU enabled for years now and haven't noticed any problems.

The IOMMU needs support from both the CPU and the motherboard chipset. Intel calls its IOMMU VT-d and markets it as I/O virtualization for virtual machines. Most i3, i5, and i7 CPUs support it; you may need to turn it on in the BIOS if there's an option. It should work on most laptops, even an old Ivy Bridge machine, which is good news, since laptops are the most vulnerable platform. Unfortunately, on desktops Intel intentionally removed the IOMMU from the high-end CPUs preferred by PC enthusiasts in the 1st (Nehalem), 2nd (Sandy Bridge), 3rd (Ivy Bridge), and 4th (Haswell) generations, possibly as a strategy to sell more Xeon chips to enterprise virtualization users (much like its cripple-ECC strategy). Most high-end overclockable K-series CPUs had their IOMMU removed while the non-K counterparts were supported, and in those generations some motherboard chipsets were also crippled and lack an IOMMU. Fortunately, Intel stopped doing this with the 5th/6th gen (Broadwell/Skylake) CPUs.
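For a concrete idea of what that looks like in practice, here is a sketch of the relevant boot configuration, assuming a GRUB-based Linux distro (paths and existing parameter values will vary per system):

```shell
# /etc/default/grub -- enable the Intel IOMMU at boot.
# After editing, regenerate the config (update-grub, or grub2-mkconfig
# on some distros) and reboot. Then check: dmesg | grep -e DMAR -e IOMMU
GRUB_CMDLINE_LINUX="intel_iommu=on"
# If the integrated GPU glitches or hangs, exclude it from the IOMMU:
# GRUB_CMDLINE_LINUX="intel_iommu=on,igfx_off"
```

On AMD systems the equivalent parameter family is "amd_iommu"; check the kernel's admin-guide kernel-parameters documentation for the exact options on your kernel version.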
No experience on AMD yet, but it should be similar. Possibly better, in fact: my impression is that AMD doesn't intentionally cripple the IOMMU or ECC like Intel does (a nasty strategy...). Check the docs.
The Developer’s Guide to Audit Logs / SIEM
This article suggests that there should be separate data collection systems for: analytics, SIEM logs, and performance metrics.
The article mentions the CEF (Common Event Format) standard but not syslog or GELF or other JSON formats.
[ArcSight] Common Event Format [PDF]: https://kc.mcafee.com/resources/sites/MCAFEE/content/live/CO...
GELF: Graylog Extended Log Format: https://docs.graylog.org/en/latest/pages/gelf.html
Wikipedia > Syslog lists a few limitations of Syslog (no message delivery confirmation, though there is a reliable delivery RFC; and insufficient payload standardization) and also links to the existing Syslog RFCs. https://en.wikipedia.org/wiki/Syslog
Are push-style systems ideal for security logshipping systems? What sort of a message broker is ideal? AMQP has reliable delivery; while, for example, ZeroMQ does not and will drop messages due to resource exhaustion.
Developers simply need an API for their particular framework that queues log structs without blocking and then ships them to a remote server. This typically means moving beyond a single-threaded application architecture so that the single main [green] thread is not blocked when the remote log server is not responding.
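As one minimal sketch of that pattern, Python's standard library already ships the pieces (logging.handlers.QueueHandler / QueueListener): the application thread only enqueues records, and a background listener thread drains the queue and ships them. The collecting handler below is a stand-in for a real shipper such as SysLogHandler or HTTPHandler pointed at a SIEM collector:

```python
import logging
import logging.handlers
import queue

log_queue = queue.Queue(-1)  # unbounded; a bounded queue would drop on overflow

# The application thread only enqueues records -- it never blocks on network I/O.
logger = logging.getLogger("audit")
logger.addHandler(logging.handlers.QueueHandler(log_queue))
logger.setLevel(logging.INFO)

# Stand-in for the real remote shipper (e.g. SysLogHandler to the SIEM).
shipped = []

class CollectingHandler(logging.Handler):
    def emit(self, record):
        shipped.append(self.format(record))

# A background thread drains the queue and hands records to the shipper.
listener = logging.handlers.QueueListener(log_queue, CollectingHandler())
listener.start()
logger.info("user=%s action=%s", "alice", "login")
listener.stop()  # flushes remaining records and joins the thread
```

If the shipper stalls, only the listener thread blocks; the application keeps running and records accumulate in the queue.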
SIEM: Security information and event management: https://en.wikipedia.org/wiki/Security_information_and_event...
Del.icio.us
The Firefox (and Chromium) bookmarks storage and sync systems still don't persist tags!
"Allow reading and writing bookmark tags" https://bugzilla.mozilla.org/show_bug.cgi?id=1225916
Notes re: how this could be standardized with JSON-LD: https://bugzilla.mozilla.org/show_bug.cgi?id=1225916#c116
The existing Web Experiment for persisting bookmark tags: https://github.com/azappella/webextension-experiment-tags/bl...
Ask HN: Recommendations for Books on Writing?
I want to propose a book club for writing as an engineer. Writing is fundamentally and critically important, but it seems that we don't emphasize it as much as we should for engineers (outside Amazon, where apparently it is a prominent member of the leadership pantheon).
I'm interested in any suggestions that HN has for great books on writing as an engineer! Accessibility and ease are important factors for a book club as well.
Technical Writing: https://en.wikipedia.org/wiki/Technical_writing
Google Technical Writing courses (1 & 2) and resources: https://developers.google.com/tech-writing :
- Google developer documentation style guide: https://developers.google.com/style
- Microsoft Writing Style Guide: https://docs.microsoft.com/en-us/style-guide/welcome/
Season of Docs is a program where applicants write documentation for open source projects: https://developers.google.com/season-of-docs/
Many open source projects are happy to accept necessary contributions of docs and editing; but do keep in mind that maintaining narrative documentation can be far more burdensome than maintaining API documentation that's kept next to the actual code. Systems like doxygen, epidoc, javadoc, and sphinx-apidoc enable developers to generate API documentation for a particular version of the software project as one or more HTML pages.
ReadTheDocs builds documentation from ReStructuredText and now also Markdown sources using Sphinx and the ReadTheDocs Docker image. ReadTheDocs organizes docs with URLs of the form <projectname>.rtfd.io/<language>/<version|latest>: https://docs.readthedocs.io/en/latest/ . The ReadTheDocs URL scheme reduces the prevalence of broken external links to documentation; though authors are indeed free to delete and rename docs pages and change which VCS tags are archived with RTD.
Write the Docs is a conference for technical documentation authors which is supported in part by ReadTheDocs: https://www.writethedocs.org/
Write the Docs > Learning Resources > All our videos and articles: https://www.writethedocs.org/topics/ :
> This page links to the topics that have been covered by conference talks or in the newsletter.
You might say that UX (User Experience) includes UI design and marketing: the objective is to imagine yourself as a customer experiencing the product or service afresh.
Writing dialogue is an activity we more often associate with creative writing exercises, where the objective is to meditate upon compassion for others.
One must imagine oneself as the people who interact with the team.
Cognitive walkthrough: https://en.wikipedia.org/wiki/Cognitive_walkthrough
The William Goldman, Jung, and Joseph Campbell books on screenwriting, archetypes, and the hero's journey monomyth are excellent if you're looking for creative writing resources.
Ask HN: How did you learn x86-64 assembly?
I'm an experienced C/C++ programmer and I occasionally look at the generated assembly to check for optimizations, loop unrolling, vectorization, etc. I understand what's going on at the surface level, but I have a hard time understanding what's going on in detail, especially at high optimization levels, where the compiler does all kinds of clever tricks. I experiment with code in godbolt.org and look up the various opcodes, but I would like to take a more structured approach to learning x86-64 assembly, especially when it comes to common patterns, tips and tricks, etc.
Are there any good books or tutorials you can recommend which go beyond the very beginner level?
High Level Assembly (HLA) https://en.wikipedia.org/wiki/High_Level_Assembly
> HLA was originally conceived as a tool to teach assembly language programming at the college-university level. The goal is to leverage students' existing programming knowledge when learning assembly language to get them up to speed as fast as possible. Most students taking an assembly language programming course have already been introduced to high-level control flow structures, such as IF, WHILE, FOR, etc. HLA allows students to immediately apply that programming knowledge to assembly language coding early in their course, allowing them to master other prerequisite subjects in assembly before learning how to code low-level forms of these control structures. The book The Art of Assembly Language Programming by Randall Hyde uses HLA for this purpose
Web: https://plantation-productions.com/Webster/
Book: "The Art of Assembly Language Programming" https://plantation-productions.com/Webster/www.artofasm.com/
Portable, Opensource, IA-32, Standard Library: https://sourceforge.net/projects/hla-stdlib/
"12.4 Programming in C/C++ and HLA" in the Linux 32 bit edition: https://plantation-productions.com/Webster/www.artofasm.com/...
... A chapter or two about wider registers, WASM, LLVM bitcode, etc. might be useful?
... Many awesome lists link to OllyDbg and other great ASM resources, such as Ghidra: https://www.google.com/search?q=ollydbg+site%3Agithub.com+in...
Brain connectivity levels are equal in all mammals, including humans: study
This is outrageously misleading. To make a claim about the number of synapses between neurons based on MRI data is completely unwarranted. Voxel size (single volumetric pixel) in MRI is approximately 1mm, while synapse size is way less than a micron. You need resolution per pixel on the order of 10s of nanometers to identify synapses.
I wouldn’t be surprised if a dead salmon also has “equal” connectivity: https://www.wired.com/2009/09/fmrisalmon/
Surely there's something to learn here though. I haven't read the original paper but a quantity that's preserved across brain scales is either an artifact or a neat insight.
Your criticism reads like someone accusing economists of being outrageously misleading when they measure macro indicators instead of sampling individual households. It's like saying Ramón y Cajal was ridiculous because he couldn't image the neuropil effectively. Or like saying early optogenetics experiments were ridiculous because who knows if you're stimulating a neuron in a realistic manner?
And in any case, it's true that synapses are comically small relative to voxel size, but we also have some reasonable information about projection patterns and synapse number from various tracer or rabies studies with which you are no doubt familiar.
I haven't read the nature paper the press release is about and I'm not a huge fan of many d/fMRI practices or derived claims. And I've worked with enough mammalian dwi data to be skeptical of specific connection claims. But this strikes me as a rather interesting result even if you can't measure all the synapses at the right resolution: either the tractography method has connectivity conservation artifacts baked in, or there's something interesting going on.
"fNIRS Compared with other neuroimaging techniques" https://en.wikipedia.org/wiki/Functional_near-infrared_spect...
> When comparing and contrasting these devices it is important to look at the temporal resolution, spatial resolution, and the degree of immobility.
I think I missed your overall point?
OP suggests that the spatial resolution of existing MRI neuroimaging capabilities is insufficient to observe, characterize, or generalize about neuronal activity in mammalian species. fNIRS (functional near-infrared spectroscopy) is one alternative neuroimaging capability that we could compare fMRI with, according to the criteria for comparison suggested in the cited Wikipedia article: "temporal resolution, spatial resolution, and the degree of immobility".
Ask HN: Resources to start learning about quantum computing?
Hi there,
I'm an experienced software engineer (15+ years of dev experience, MSc in Computer Science), and quantum computing is the first thing in my experience that has been hard to grasp/understand. I'd love to fix that ;)
What resources would you recommend to start learning about quantum computing?
Ideally resources that touch both the theoretical base and evolve to more practical usages.
"What are some good resources to learn about Quantum Computing?" https://news.ycombinator.com/item?id=16052193 https://westurner.github.io/hnlog/#comment-16052193
Launch HN: Charityvest (YC S20) – Employee charitable funds and gift matching
Stephen, Jon, and Ashby here, the co-founders of Charityvest (https://charityvest.org). We created a modern, simple, and affordable way for companies to include charitable giving in their suite of employee benefits.
We give employees their own tax-deductible charitable giving fund, like an “HSA for Charity.” They can make contributions into their fund and, from their fund, support any of the 1.4M charities in the US, all on one tax receipt.
Using the funds, we enable companies to operate gift matching programs that run on autopilot. Each donation to a charity from an employee is matched automatically by the company in our system.
A company can set up a matching gift program and launch giving funds to employees in about 10 minutes of work.
Historically, corporate charitable giving matching programs have been administratively painful to operate. Making payments to charities, maintaining tax records, and doing due diligence on charitable compliance is taxing on HR / finance teams. The necessary software to help has historically been quite expensive and not very useful for employees beyond the matching features.
This is one example of an observation Stephen made after working for years as a philanthropic consultant. Consumer fintech products aren’t built to make great giving experiences for donors. Instead, they are built for buyers — e.g., nonprofits (fundraising) or corporations (gift matching) — without a ton of consideration for the everyday user experience.
A few years back, my wife and I made a commitment to give a portion of our income away every year, and we found it administratively painful to give regularly. The tech that nonprofits typically use hardly inspires generosity — e.g., high fees, poor user flows, and questionable information flow (like tax receipts). Giving platforms try to compensate for poor functionality with bright pictures of happy kids in developing countries, but when the technology is not a good financial experience it puts a damper on things.
Charityvest started when I noticed a particular opportunity with donor-advised funds, which are tax-deductible giving funds recognized by the IRS. They are growing quickly (20% CAGR), but mainly among the high-net worth demographic. We believe they are powerful tools. They enable donors to have a giving portfolio all from one place (on one tax receipt) and have full control over their payment information/frequency, etc. Most of all, they enable a donor to split the decisions of committing to give and supporting a specific organization. Excitement about each of these decisions often strikes at different times for donors—particularly those who desire to give on a budget.
We believe everyone should have their own charitable giving fund no matter their net worth. We’ve created technology that has democratized donor-advised funds.
We also believe good technology should be available for every company, big and small. Employers can offer Charityvest for $2.49 / employee / month subscription, and we charge no fees on any of the giving — charities receive 100% of the money given.
Lastly, we send the program administrator a fun report every month to let them know all the awesome giving their company and its employees did in one dashboard. This info can be leveraged for internal culture or external brand building.
We’re just launching our workplace giving product, but we’ve already built a good portfolio of trusted customers, including Eric Ries’ (author of The Lean Startup) company, LTSE. We’ve particularly seen a number of companies use us as a meaningful part of their corporate decision to join the fight for racial justice in substantive ways.
Our endgame is that the world becomes more generous, starting with the culture of every company. We believe giving is fundamentally good and we want to build technology that encourages more of it by making it more simple and accessible.
You can check out our workplace giving product at (https://charityvest.org/workplace-giving). If you’re interested, we can get your company up and running in 10 minutes. Or, please feel free to forward us on to your HR leadership at your company.
Our giving funds are also available for free for any individual on https://charityvest.org — without gift matching and reporting. We’d invite you to check out the experience. For individuals, we make gifts of cash and stock to any charity fee-free.
Happy to share this with you all, and we’d love to know what you think.
What a great idea!
Are there two separate donations or does it add the company's name after the donor's name? Some way to notify recipients about the low cost of managing a charitable donation match program with your service would be great.
Have you encountered any charitable foundations which prefer to receive cryptoassets? Red Cross and UNICEF accept cryptocurrency donations for the children, for example.
Do you have integration with other onboarding and HR/benefits tools on your roadmap? As a potential employee, I would like to work for a place that matches charitable donations, so mentioning as much in job descriptions would be helpful.
Thanks! Our matching system issues an identical grant from the fund of the matching company. It goes out in the same grant cycle as the employee grant so they go together.
We haven't yet encountered any charity that prefers to receive cryptoassets.
We have lots of dreams about thoughtful integrations with HR software, but we want the experience to be excellent, and we want to keep the experience of our existing app excellent. We'll be balancing those priorities as we grow.
> Our matching system issues an identical grant from the fund of the matching company. It goes out in the same grant cycle as the employee grant so they go together.
So the system creates a separate transaction for the original and the matched donation with each donor's name on the respective gift?
Which elements of their HR information do users sync with your service? IDK what the monthly admin cost there is.
There are a few HR, benefits, contracts, and payroll YC companies with privacy regulation compliance and APIs https://www.ycombinator.com/companies/?query=Payroll
https://founderkit.com/people-and-recruiting/health-insuranc...
Separate data records, yes, separate payment, no. The charity receives one consolidated check in the mail each month across all donors (one payment), and the original donor's grant will be on the data sheet as well as the matching grant from the company's corporate fund (separate records).
Today, users create their accounts directly with our app, and they are affiliated with their corporation in our app via their email address. So we don't integrate with any HR information.
Administrators of the program can add and remove employees via copying and pasting email addresses (can add/remove many at a time). We aim to integrate with HR systems of record in the future to make this seamless.
Thanks for clarifying.
Do you offer a CSV containing donor information to the charity?
Do you support anonymous matched donations?
Can donors specify that a donation is strongly recommended for a specific effort?
...
3% * $1000/yr == $2.50/mo * 12mo
CSV: yes if they request we’ll happily provide.
Anonymous: yes it’s an option on our grant screen
Specific efforts: yes on the grant screen we enable donors to add a “specific need” they’d like their grant to fund.
:-)
It may be helpful to integrate with charity evaluation services to help donors assess various opportunities to give.
Charity Navigator > Evaluation method https://en.wikipedia.org/wiki/Charity_Navigator#Evaluation_m...
We love the idea of layering in rich information to help donors make informed decisions! We just want to be really thoughtful about the experience. So we plan to get there, it just may take us some time.
We Need a Yelp for Doctoral Programs
How are the data needs for such a doctoral and post-doctoral evaluation program different from the data needs for https://collegescorecard.ed.gov ?
Data: https://collegescorecard.ed.gov/data/
Data documentation: https://collegescorecard.ed.gov/data/documentation/
Reviews seem like the biggest need.
“The culture in the X department is terrible, everyone works 80 hours a week and is miserable”
“Night life is great at least, the grad students usually go out for student nights at blah blah after seminar”
“Underrated X program because at university Y, don’t be fooled we literally have 4 of the 5 top researchers in field Z”
Etc.
But it is so hard to get people to give reviews. People seem to give reviews when they are either very pissed off or very happy, and the latter seems to be a very rare case.
I have always thought that bad reviews are sufficient for all purposes, and the only reason to even have good reviews is so the reviewees don't feel overly persecuted.
Any product or service will have occasional customers who are either unreasonable or have bad luck. If you have a lot of bad reviews, then as a prospective customer you want to see if all of them fall under the two categories or whether there is a pattern that is relevant to you.
It seems to me that a review database works fine even when all the reviews are bad, and the only regulation it really needs is an effort to prevent one single individual with a vendetta from overwhelming it.
All of the World’s Money and Markets in One Visualization
> Derivatives top the list, estimated at $1 quadrillion or more in notional value according to a variety of unofficial sources.
1 Quadrillion: 1,000,000,000,000,000 (10^15)
Derivative (finance) https://en.wikipedia.org/wiki/Derivative_(finance)
Derivatives market https://en.wikipedia.org/wiki/Derivatives_market :
> The market can be divided into two, that for exchange-traded derivatives and that for over-the-counter derivatives.
Why companies lose their best innovators (2019)
https://news.ycombinator.com/item?id=23886158
> Three reasons companies lose their best innovators.
> 1. They fail to recognize and support the innovators
> 2. Innovation becomes a herculean task
> 3. Corporations don’t match rewards with outcomes
While the paragraphs under point 2 do discuss risk and the paragraphs under point 3 do discuss rewards, I'm not sure this article belongs here.
Risk and Reward.
Large corporations are able to pay people by doing things at scale; with sufficient margin at sufficient volume to justify continued investment. Risk is minimized by focusing on ROI.
Startups assume lots of risk and lots of debt and most don't make it. Liquidation preference applies as the startup team adjourns (and maybe open-sources what remains). In a large corporation, that burnt capital is reported to the board (which represents the shareholders) who aren't "gambling" per se. "You win some and you lose some" is across the street; and they don't have foosball and snacks.
How can large organizations (nonprofit, for-profit, governmental) foster intrapreneurial mindsets without just continuing to say "innovation" more times and expecting things to happen? Drink. "Innovators welcome!". Drink water.
"Intrapreneurial." What does that even mean? The employee, within their specialized department, spends resources (time, money, equipment) on something that their superior managers have not allocated funding for because they want: (a) recognition; (b) job security; (c) to save resources such as time and money; (d) to work on something else instead of this wasteful process; (e) more money.
Very few organizations have anything like "20% time". Why was 20% time thrown off the island to a new island where they had room to run? Do they have foosball? Or is the work so fun that they don't even need foosball? Or is it worth long days and nights because the potential return is enough money to retire tomorrow and then work on what?
Step 1. Steal innovators to work on our one thing
Step 2.
Step 3. Profit.
20% Project: https://en.wikipedia.org/wiki/20%25_Project
Intrapreneurship: https://en.wikipedia.org/wiki/Intrapreneurship
Internal entrepreneur: https://en.wikipedia.org/wiki/Internal_entrepreneur
CINO: Chief Innovation Officer / CTIO: Chief Technology Innovation Officer https://en.wikipedia.org/wiki/Chief_innovation_officer
... Is acquiring innovation and bringing it to scale a top-down process? How do we capture creative solutions and then allocate willing and available resources to making that happen?
awesome-ideation-tools: https://github.com/zazaalaza/awesome-ideation-tools
Powerful AI Can Now Be Trained on a Single Computer
Lots of people are focusing on this being done on a particularly powerful workstation, but the computer described seems to have power at a similar order of magnitude to the many servers which would be clustered together in a more traditional large ML computation. Either those industrial research departments could massively cut costs/increase output by just “magically keeping things in ram,” or these researchers have actually found a way to reduce the computational power that is necessary.
I find the efforts of modern academics to do ML research on relatively underpowered hardware by being more clever about it to be reminiscent of soviet researchers who, lacking anything like the access to computation of their American counterparts, were forced to be much more thorough and clever in their analysis of problems in the hope of making them tractable.
If anything it seems to me that doing the most work under constraint of resources is precisely what intelligence is about. I've always wondered why the consumption of compute resources is itself not treated as a significant part of the 'reward' in ML tasks.
At least if you're taking inspiration from biological systems, it clearly is part of the equation, a really important one even.
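As a purely illustrative sketch of that idea (every name here is made up; this is not from any RL library), folding compute cost into the reward could be as simple as a linear penalty on the FLOPs a policy spends per episode:

```python
# Hypothetical: penalize an agent's task reward by the compute it consumed.
# LAMBDA is a tunable "price per FLOP" that sets the accuracy/compute trade-off.
LAMBDA = 1e-9

def effective_reward(task_reward: float, flops_used: float) -> float:
    """Task reward net of a linear compute cost."""
    return task_reward - LAMBDA * flops_used

# Two policies with equal task performance: the cheaper one scores higher,
# so the optimizer is pushed toward frugal solutions.
cheap = effective_reward(1.0, 1e9)
expensive = effective_reward(1.0, 5e9)
assert cheap > expensive
```

In practice one would measure wall-clock time or energy rather than raw FLOPs, and the penalty need not be linear; the point is only that the objective can see its own cost.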
Ask HN: Something like Khan Academy but full curriculum for grade schoolers?
Khan Academy continually gets held up as a great resource for online courses across the age spectrum for math-related subjects. With the pandemic continuing to grow in the US and schools not really sure how to handle things, the GF and I are looking into other options.
Is there a recommended resource that gives unbiased (as possible) reviews for middle school (7-8th grade) curriculum? Searching these days really doesn't bring up quality, just options one has to comb through.
K12 CS Framework (ACM, Code.org, [...]) https://k12cs.org/
Computing Curricula 2020 (ACM, IEEE) http://www.cc2020.net/
Official SAT Practice (College Board, Khan Academy) https://www.khanacademy.org/sat
http://wrdrd.github.io/docs/consulting/software-development#... (TODO: add link to cc2020 draft)
Programmer Competency Matrix: http://sijinjoseph.com/programmer-competency-matrix/ , https://competency-checklist.appspot.com/ , https://github.com/hltbra/programmer-competency-checklist
Re: Computational thinking https://westurner.github.io/hnlog/#comment-15454421
Coding Interview University: https://github.com/jwasham/coding-interview-university
AutoML-Zero: Evolving Code That Learns
They use evolutionary search [0] to create programs for image classification, specifically 'binary classification tasks extracted from CIFAR-10'. They do it from scratch, though they use a pytorch-ish programming language with differentiation baked in.
Looks like a nice summer intern project.
"AutoML-Zero: Evolving Machine Learning Algorithms From Scratch" (2020) https://arxiv.org/abs/2003.03384 https://scholar.google.com/scholar?cluster=11748751662887361...
How does this compare to MOSES (OpenCog/asmoses) or PLN? https://github.com/opencog/asmoses https://scholar.google.com/scholar?hl=en&as_sdt=0%2C43&q=%22... (2007)
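The evolutionary-search idea described above can be sketched in miniature. This toy (entirely hypothetical, not the paper's actual setup or instruction set) evolves a short straight-line program of arithmetic steps to fit a target function; AutoML-Zero's search space is far richer, but the loop shape is the same: mutate, evaluate, select.

```python
import random

# Toy instruction set: a "program" is a list of (op, constant) steps
# applied in order to a scalar input.
OPS = {
    "add": lambda v, c: v + c,
    "mul": lambda v, c: v * c,
    "sub": lambda v, c: v - c,
}

def run(program, x):
    for op, c in program:
        x = OPS[op](x, c)
    return x

def fitness(program, samples):
    # Mean squared error against the target function (lower is better).
    return sum((run(program, x) - y) ** 2 for x, y in samples) / len(samples)

def mutate(program, rng):
    prog = list(program)
    i = rng.randrange(len(prog))
    op, c = prog[i]
    if rng.random() < 0.5:
        prog[i] = (rng.choice(list(OPS)), rng.uniform(-2, 2))  # replace a step
    else:
        prog[i] = (op, c + rng.gauss(0, 0.3))  # locally tweak the constant
    return prog

def evolve(samples, pop_size=50, generations=200, seed=0):
    rng = random.Random(seed)
    pop = [[(rng.choice(list(OPS)), rng.uniform(-2, 2)) for _ in range(3)]
           for _ in range(pop_size)]
    for _ in range(generations):
        pop.sort(key=lambda p: fitness(p, samples))
        # Truncation selection: keep the best half, refill with mutants.
        survivors = pop[: pop_size // 2]
        pop = survivors + [mutate(rng.choice(survivors), rng) for _ in survivors]
    return min(pop, key=lambda p: fitness(p, samples))

# Target: y = 2x + 1, representable exactly by a 3-step program.
samples = [(x, 2 * x + 1) for x in range(-5, 6)]
best = evolve(samples)
print(fitness(best, samples))
```

Note there is no explicit compute-cost term in the fitness here; adding one (e.g. penalizing program length) is exactly the kind of resource-aware reward discussed in the comment above.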
SymPy - a Python library for symbolic mathematics
SymPy is the best!
In addition to the already-excellent official tutorial https://docs.sympy.org/latest/tutorial/index.html , I have written a short printable tutorial summary of the functions most useful for students https://minireference.com/static/tutorials/sympy_tutorial.pd...
For anyone interested in trying SymPy without installing, there is also the online shell https://live.sympy.org/ which works great and allows you to send entire computations as a URL, see for example this comment that links to an important linear algebra calculation: https://news.ycombinator.com/item?id=23158095
How does it compare to Matlab's symbolic toolbox, aside from not being affordable outside of academia?
NumPy for Matlab users: https://numpy.org/doc/stable/user/numpy-for-matlab-users.htm...
SymPy vs Matlab: https://github.com/sympy/sympy/wiki/SymPy-vs.-Matlab
If you then or later need to do distributed ML, it is advantageous to be working in Python: Dask Distributed, Dask-ML, RAPIDS.ai (cuDF), PyArrow, xeus-cling.
Good to know: sympy does not use numpy so you can use pypy instead of python.
In my case I got a 7x speedup.
You can however use sympy to convert symbolic expressions to numeric numpy functions, using the "lambdify" feature. It's awesome because it lets you symbolically generate and manipulate a numerical program.
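A minimal sketch of the lambdify workflow (assuming sympy and numpy are installed): build an expression symbolically, differentiate it, then compile both to vectorized numeric functions.

```python
import numpy as np
import sympy as sp

x = sp.symbols("x")
expr = sp.sin(x) * sp.exp(-x)

# Symbolically differentiate, then compile both expressions into fast
# vectorized NumPy functions.
f = sp.lambdify(x, expr, "numpy")
df = sp.lambdify(x, sp.diff(expr, x), "numpy")

xs = np.linspace(0, 5, 1000)
print(f(xs).max())  # evaluate the compiled expression on a whole array
print(df(0.0))      # derivative at 0: cos(0)e^0 - sin(0)e^0 = 1.0
```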
SymEngine https://github.com/symengine/symengine
> SymEngine is a standalone fast C++ symbolic manipulation library. Optional thin wrappers allow usage of the library from other languages, e.g.:
> [...] Python wrappers allow easy usage from Python and integration with SymPy and Sage (the symengine.py repository)
https://en.wikipedia.org/wiki/SymPy > Related Projects:
> SymEngine: a rewriting of SymPy's core in C++, in order to increase its performance. Work is currently in progress to make SymEngine the underlying engine of Sage too
How does it compare to sage math? I used sage math today and was pleasantly surprised. Even has a latex option. If you name your variables correctly, it even outputs Greek symbols and subscript correctly!
User 29athrowaway mentions in another comment that they use Sympy through Sagemath, so they probably could answer your question best. It sounds like Sympy is a subset of what Sage offers, but I'm not familiar enough with either tool to know.
Ask HN: Are there any messaging apps supporting Markdown?
I'd like to easily send formatted code, and bullet points, etc. through a messaging app without having to resort to a heavy app like Slack.
Mattermost supports CommonMark Markdown: https://docs.mattermost.com/help/messaging/formatting-text.h...
Zulip supports ~CommonMark Markdown: https://zulip.readthedocs.io/en/latest/subsystems/markdown.h...
Reddit supports Markdown. https://www.reddit.com/wiki/markdown
Discourse now supports CommonMark Markdown.
GitHub, BitBucket, GitLab and Gogs/Gitea support Markdown.
I wouldn't consider Discourse/Reddit/Github/etc. to be a messaging app per se, even if some of those have messaging functionality between users...
I disagree on the category definition. Public messaging (without PM or DM features) is still messaging; and often far more useful than trying to forward 1:1 messages in order to bring additional participants onboard.
It's worth noting that GH/BB/GL have all foregone PM features; probably for the better in terms of productivity: messaging @all is likely more productive.
What vertical farming and ag startups don't understand about agriculture
My father is an ag soil chemist of 50+ years.
I'm an industrial systems eng. w/ a specialty in polymer-textile-fiber engineering. (Mostly useless skillsets in the US now)
Gonna share a few lessons here about agriculture that I try to convey to EECS, econ, Neuroscience, and the web developer crowd.
- You can only grow non-calorically dense foods in vertical farms
- It takes 10–14 kWh per 1,000 gallons of water to desalinate. More if the source water gets periodically polluted at an increasing rate.
- Majority-agrarian populations exist because those countries are stuck in a purgatory of <1 MWh/capita per annum, whereby the country doesn't have scalable nitrogen and steel manufacturing.
- Potatoes and sweet potatoes are some of the highest-satiety, lowest input-to-output-ratio produce. High efficiency.
- In civilizations at <1 MWh/capita per annum, there is not enough electricity to produce tools for farming, steel for roads, and concrete for building things. The end result is that the optimal decision is to have more children to harvest more calories per acre.
- Property, bankruptcy, and inheritance law have an immense influence on the farmer population of a country.
I remember telling some "ag tech" VCs my insights and offering to introduce my father who has an immense amount of insight on the topic from having grown things for as long as he has....My thoughts were tossed aside.
So you can't grow potatoes vertically? Can you elaborate? Is it a function of physiology, i.e. calorie dense vegetables need far more leaves and supporting stems than can be practically stacked vertically?
I imagine space is a factor, but energy will be a big one as well. Calorie dense foods will likely need more space and energy (light) inputs. Vertical farms are very water efficient, so I don't think that matters much.
Vertical farms make a lot more sense with fresh vegetables like leafy greens that grow quickly, command high prices if grown organically, and benefit from being closer to market.
Potatoes are the exact opposite. If it ever becomes more cost-effective to grow corn, wheat, and potatoes in vertical farms, then outdoor agriculture is dead. While I don't agree with the article that it will never happen, it might require energy advances like fusion power or drastically higher _rural_ land values and water prices.
Greenhouses make sense long before vertical farming; just look at agriculture in the Netherlands. It's mind-boggling how much they produce for such a tiny country.
Can you expand on this?
I get that to store a calorie in a potato I need to supply a calorie of energy from somewhere else.
But why is fusion power required instead of better UV lamps in my vertical farm? (Assuming I had enough electricity to run them)
The total amount of electricity to power those UV lamps should be on par with what the Sun sends to the potato fields. Maybe that's the reason for fusion. I didn't do the math.
Actually no not really. Plants only absorb two wavelengths of light. It's currently more efficient to convert sun into solar power via panels and then to light LEDs supplying only the wavelengths that plants use. Despite the seeming inefficiency here, the fact is that plants are even more inefficient at absorbing light not at the right wavelengths than solar panels.
Could one imagine a material that would absorb solar spectrum and emit the preferred frequencies? Something like a polymer one could stretch over fields to get more from the suns rays.
>> "Actually no not really. Plants only absorb two wavelengths of light. It's currently more efficient to convert sun into solar power via panels and then to light LEDs supplying only the wavelengths that plants use. Despite the seeming inefficiency here, the fact is that plants are even more inefficient at absorbing light not at the right wavelengths than solar panels."
> Could one imagine a material that would absorb solar spectrum and emit the preferred frequencies? Something like a polymer one could stretch over fields to get more from the suns rays.
Would you call that a "solar transmitter"?
https://en.wikipedia.org/wiki/Transmitter :
> Generators of radio waves for heating or industrial purposes, such as microwave ovens or diathermy equipment, are not usually called transmitters, even though they often have similar circuits.
Would "absorption spectroscopy" specialists have insight into whether this is possible without solar cells, energy storage, and UV LEDs? https://en.wikipedia.org/wiki/Absorption_spectroscopy
(edit) The thermal energy from sunlight (from the FREE radiation from the nuclear reaction at the center of our solar system) is also useful to and necessary for plants. There's probably a passive heat pipe / solar panel cooling solution that could harvest such heat for colder seasons and climates.
Also, UV-C is useful for sanitizing (UVGI) but not really for plant growth. https://en.wikipedia.org/wiki/Ultraviolet_germicidal_irradia... :
> UVGI can be coupled with a filtration system to sanitize air and water.
Is that necessary or desirable for plants?
https://www.lumigrow.com/learning-center/blogs/the-definitiv... :
> The light that plants predominately use for photosynthesis ranges from 400–700 nm. This range is referred to as Photosynthetically Active Radiation (PAR) and includes red, blue and green wavebands. Photomorphogenesis occurs in a wider range from approximately 260–780 nm and includes UV and far-red radiation.
Photomorphogenesis: https://en.wikipedia.org/wiki/Photomorphogenesis
PAR: Photosynthetically active radiation: https://en.wikipedia.org/wiki/Photosynthetically_active_radi...
Grow light: https://en.wikipedia.org/wiki/Grow_light
Are there bioluminescent e.g. algae which emit PAR and/or UV? Algae can feed off of waste industrial gases.
Bioluminescence > Light production: https://en.wikipedia.org/wiki/Bioluminescence#Light_producti...
Biophoton: https://en.wikipedia.org/wiki/Biophoton
Chemiluminescence: https://en.wikipedia.org/wiki/Chemiluminescence
Electrochemiluminescence: https://en.wikipedia.org/wiki/Electrochemiluminescence
Quantum dot display / "QLED": https://en.wikipedia.org/wiki/Quantum_dot_display
Could it be possible? Analyzing the inputs and outputs is useful in natural systems, as well.
Ask HN: What are your go to SaaS products for startups/MVPs?
Looking for some inspiration. I've done a lot of MVPs/early-stage apps over the years and I tend to lean on the same SaaS portfolio for mails, text gateways, payment etc, but I'm sure I've missed a few valuable additions.
Here are a few I use:
Mails: Mailchimp / Mandrill
Payment: Paylike
Search: Algolia
https://StackShare.io and https://FounderKit.com are great places to find reviews of SaaS services:
> mails,
https://founderkit.com/growth-marketing/email-marketing/revi...
https://stackshare.io/email-marketing
https://zapier.com/learn/email-marketing/
> text gateways,
https://founderkit.com/apis/sms/reviews
https://stackshare.io/voice-and-sms
> payments
https://founderkit.com/apis/credit-card-processing/reviews
https://stackshare.io/payment-services
Both have categories:
https://stackshare.io/categories
https://founderkit.com/reviews
If you're looking for market/competition research, I'd recommend https://coscout.com as well, for info about companies, funding, competitors, tech stack etc
It's in alpha right now, so you'll have to get on the waitlist, though they're only a few weeks away from launching AFAIK.
Full disclosure: A friend of mine is building this :)
Long term viability of SaaS solutions is definitely worth researching.
Is this something that's going to get acquired and be extinguished?
What are our switching costs?
How do we get our data in a format that can be: read into our data warehouse/lake and imported into an alternate service if necessary in the future?
How does Coscout compare to e.g. Crunchbase, PitchBook (Morningstar), YCharts, AngelList?
Ask HN: Do you read aloud or silently in your minds?
Most times while reading a new topic that I am not familiar with, I tend to read aloud in my mind. Yet that changes based on the content and the way it is written.
When I'm focused, I notice that reading silently helps increase my reading speed and cognition, like everything is flowing in.
Other times I don't seem to understand anything if I'm not reading it aloud in my mind.
Has anyone noticed such a thing, and if so, can you share any tips or information you've learned about this behavior?
Subvocalization https://en.wikipedia.org/wiki/Subvocalization
Speed reading https://en.wikipedia.org/wiki/Speed_reading
Ask HN: How do you deploy a Django app in 2020?
Hi. I'm a mid-level software engineer trying to deploy a small (2000 max users) django app to production.
If I Google "how to deploy a Django app", I get 10+ different answers.
Can anyone on HN help me, please?
You'll probably get 10 different answers here too.
I use uwsgi in emperor mode. I have a Makefile that runs a few SSH commands on production to clone my git repo at a specific commit, builds a virtual environment from requirements.txt, and atomically swaps a symlink to a uwsgi socket. Nginx in front of it all. It's pretty much the Python equivalent of FTPing HTML files to a web server.
Uwsgi is terrifying because it has so many options (and I wonder how secure it is), but it has never failed me. Packaging and containerizing things sounds cool, but I just can't justify spending time on it when my setup works fine (I'm a solo dev).
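A sketch of the kind of Makefile deploy described above; the host, paths, and release layout here are hypothetical, and the uWSGI emperor/vassal ini is assumed to reference the `current` symlink so the swap is picked up automatically:

```make
# Hypothetical host and paths; adjust to your setup.
HOST    = deploy@example.com
APP     = /srv/myapp
COMMIT ?= $(shell git rev-parse HEAD)

deploy:
	ssh $(HOST) '\
	  git -C $(APP)/repo fetch && \
	  git -C $(APP)/repo worktree add $(APP)/releases/$(COMMIT) $(COMMIT) && \
	  python3 -m venv $(APP)/releases/$(COMMIT)/venv && \
	  $(APP)/releases/$(COMMIT)/venv/bin/pip install \
	    -r $(APP)/releases/$(COMMIT)/requirements.txt && \
	  ln -sfn $(APP)/releases/$(COMMIT) $(APP)/current'
	# The uwsgi emperor watches the vassal ini under $(APP)/current
	# and reloads the app when the symlink changes.
```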
If you have only one production server, dokku is "A docker-powered PaaS that helps you build and manage the lifecycle of applications." Dokku supports Heroku buildpack deployment (buildstep), Procfiles, Dockerfile deployment, Docker image deployment, git deployment (gitreceive), or tarfile deployments. https://github.com/dokku/dokku
There are a number of plugins for Dokku. Dokku ships with the nginx plugin as the HTTP frontend proxy. Dokku supports SSL certs with the certs plugin.
When you need to move to more than one server, what do you do? There's now a dokku-scheduler-kubernetes plugin which can do HA (high availability) which is worth reading about before you develop and document your own deployment workflow. https://github.com/dokku/dokku-scheduler-kubernetes
I also always put build, test, and deployment commands in a Makefile.
Package it: as a container, or as containers that install an RPM/DEB/APK/conda package/Python package (possibly containing a zipapp). Zipapps are fast.
If you have any non-python dependencies, a Pythonpkg only solves for part of the packaging needs.
Producing a packaged artifact should be easy and part of your CI build script.
Here's the cookiecutter-django production docker-compose.yml with containers for django, celery, postgres, redis, and traefik as a load balancer: https://github.com/pydanny/cookiecutter-django/blob/master/%...
Cookiecutter-django also includes a Procfile.
With k8s, you have an ingress (~load balancer + SSL termination proxy) other than traefik.
You can generate k8s YML from docker-compose.yml with Kompose.
I just found this which describes using GitLab CI with Helm: https://davidmburke.com/2020/01/24/deploy-django-with-helm-t...
What is the command to scale up or down? Do you need a geodistributed setup (on multiple providers' clouds)? Who has those credentials and experience?
How do you do red/green or rolling deployments?
Can you run tests in a copy of production?
Can you deploy when the tests that run on git commit pass?
What runs the database migrations in production; while users are using the site?
If something deletes the whole production setup or the bus factor is 1, how long does it take to redeploy from zero; and how much manual work does it take?
CI + Ansible + Terraform + Kubernetes.
Whatever tools you settle on, django-environ for a 12 Factor App may be advisable. https://github.com/joke2k/django-environ
The Twelve-Factor App: https://12factor.net/
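The core 12-factor move (read config from the environment, not from checked-in files) can be sketched with just the stdlib; django-environ layers type casting and database-URL parsing on top of this. The variable names below are illustrative, not a real settings module:

```python
import os

def env(name, default=None, cast=str):
    """Read a config value from the environment (12-factor style)."""
    raw = os.environ.get(name)
    if raw is None:
        if default is None:
            raise RuntimeError(f"missing required env var: {name}")
        return default
    if cast is bool:
        return raw.lower() in ("1", "true", "yes")
    return cast(raw)

# Simulate a deployment environment for the example.
os.environ["DJANGO_DEBUG"] = "false"
os.environ["DATABASE_URL"] = "postgres://localhost/app"

DEBUG = env("DJANGO_DEBUG", default=False, cast=bool)
DATABASE_URL = env("DATABASE_URL")
print(DEBUG, DATABASE_URL)  # prints: False postgres://localhost/app
```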
Containers from first principles
I recently discovered systemd-nspawn and was amazed at how lightweight a basic container can be.
"Docker Without Docker" (2015) explains /sbin/init and systemd-nspawn. Systemd did not exist when docker was first created. https://chimeracoder.github.io/docker-without-docker/
That's not what I remember, and Wikipedia backs up that recollection [0], marking systemd's initial release in 2010. Docker's initial release is listed as 2013 [1]. Maybe that was true of the dotCloud internal releases, and certainly not all of the EL and other Linux distros had adopted systemd during 2010–2013. But after 2014 or 2015, systemd had spread to the major Linux distros, so Docker could have chosen to take a systemd-based approach at that point.
Are there other systemd + containers solutions?
"Chapter 4. Running containers as Systemd services with Podmam" https://access.redhat.com/documentation/en-us/red_hat_enterp...
AFAIU, when running containers with systemd:
- logs go to journald by default
- there's no docker-compose equivalent for managing just the [name-prefixed] containers from a docker-compose.yml
- you can use systemd unit template parametrization
- it's not as easy to collect metrics on every container on the system without a read-only docker socket: how many containers are running, how much RAM quota are they assigned and utilizing? What are the filesystem and port mappings?
- you can run containers as non-root
- you can run containers in systemd timer units
- you use runC to handle seccomp
... You can do cgroups and namespaces with just systemd; but keeping chroots/images upgraded is outside the scope of systemd: where is the ideal boundary between systemd and containers?
See this comment regarding per-container MAC MCS labels: https://news.ycombinator.com/item?id=23430959
There's much additional complexity that justifies k8s / OpenShift: when would I want to manage containers with just systemd units?
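As an illustration of the "just systemd units" approach, a hand-written unit along these lines runs a container as an ordinary service (the image, name, and ports are placeholders; `podman generate systemd` produces a more robust version of this automatically):

```ini
# /etc/systemd/system/myapp-container.service  (illustrative)
[Unit]
Description=myapp in a container
Wants=network-online.target
After=network-online.target

[Service]
ExecStartPre=-/usr/bin/podman rm -f myapp
ExecStart=/usr/bin/podman run --name myapp --rm -p 8080:8080 docker.io/library/nginx:alpine
ExecStop=/usr/bin/podman stop myapp
Restart=on-failure

[Install]
WantedBy=multi-user.target
```

With this, the container gets journald logging, dependency ordering, and restart policy from systemd for free, which answers part of the list above.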
> Many people might think the word “container” has a specific meaning within the Linux kernel; however the kernel has no notion of a “container”. The word has been synonymous with a variety of Linux tooling which when applied give the resemblance of what we expect a container to be.
Before LXC ( https://LinuxContainers.org ) and CNCF ( https://landscape.cncf.io/ ) and OCI ( https://opencontainers.org/ ), for shared-kernel VPS hosting ("virtual private server"; root on a shared box), there was OpenVZ (which requires a patched kernel and AFAIU still has features, like bursting, not present in cgroups).
Docker no longer has an LXC driver: libcontainer (opencontainers/runc) is the story now. The LXC docs have a great list of utilized kernel features that's also still true for docker-engine = runC + moby. The LXC docs: https://linuxcontainers.org/lxc/introduction/ :
> Current LXC uses the following kernel features to contain processes:
> ## Kernel namespaces (ipc, uts, mount, pid, network and user)
>> Namespaces are a feature of the Linux kernel that partitions kernel resources such that one set of processes sees one set of resources while another set of processes sees a different set of resources. https://en.wikipedia.org/wiki/Linux_namespaces
> ## Apparmor and SELinux profiles https://en.wikipedia.org/wiki/AppArmor / https://en.wikipedia.org/wiki/Security-Enhanced_Linux
udica is an interesting tool for creating SELinux policies for containers.
Is it possible for each container to run confined with a different SELinux label?
> ## Seccomp policies https://en.wikipedia.org/wiki/Seccomp
See below re: Seccomp.
> ## Chroots (using pivot_root) https://en.wikipedia.org/wiki/Chroot
Chroots and symlinks, Chroots and bind mounts, Chroots and overlay filesystems, Chroots and SELinux context labels.
FWIU, Chroots are a native feature of filesystem syscalls in Fuchsia.
> ## Kernel capabilities
https://wiki.archlinux.org/index.php/Capabilities :
>> "Capabilities (POSIX 1003.1e, capabilities(7)) provide fine-grained control over superuser permissions, allowing use of the root user to be avoided. Software developers are encouraged to replace uses of the powerful setuid attribute in a system binary with a more minimal set of capabilities. Many packages make use of capabilities, such as CAP_NET_RAW being used for the ping binary provided by iputils. This enables e.g. ping to be run by a normal user (as with the setuid method), while at the same time limiting the security consequences of a potential vulnerability in ping."
> ## CGroups (control groups)* https://en.wikipedia.org/wiki/Cgroups
Control groups enable per-process (and thus per-container) resource quotas. Other than limiting the impact of resource exhaustion, cgroups are not a security feature of the Linux kernel.
Here's a helpful explainer of the differences between some of these kernel features; which, combined, have become somewhat ubiquitous:
From "Formally add support for SELinux" (k3s #1372) https://github.com/rancher/k3s/issues/1372#issuecomment-5817... :
> https://blog.openshift.com/securing-kubernetes/*
>> The main thing to understand about SELinux integration with OpenShift is that, by default, OpenShift runs each container as a random uid and is isolated with SELinux MCS labels. The easiest way of thinking about MCS labels is they are a dynamic way of getting SELinux separation without having to create policy files and run restorecon.*
>> If you are wondering why we need SELinux and namespaces at the same time, the way I view it is namespaces provide the nice abstraction but are not designed from a security first perspective. SELinux is the brick wall that’s going to stop you if you manage to break out of (accidentally or on purpose) from the namespace abstraction.
>> CGroups is the remaining piece of the puzzle. Its primary purpose isn’t security, but I list it because it regulates that different containers stay within their allotted space for compute resources (cpu, memory, I/O). So without cgroups, you can’t be confident your application won’t be stomped on by another application on the same node.
From Wikipedia: https://en.wikipedia.org/wiki/Seccomp :
> seccomp (short for secure computing mode) is a computer security facility in the Linux kernel. seccomp allows a process to make a one-way transition into a "secure" state where it cannot make any system calls except exit(), sigreturn(), read() and write() to already-open file descriptors. Should it attempt any other system calls, the kernel will terminate the process with SIGKILL or SIGSYS.[1][2] In this sense, it does not virtualize the system's resources but isolates the process from them entirely.
... SELinux is one implementation of MAC (Mandatory Access Controls) that is built upon the LSM (Linux Security Modules) support in the Linux kernel. Some distros include policy sets for Docker hosts and lots of other packages that could be installed; see: "Formally add support for SELinux" (k3s #1372) https://github.com/rancher/k3s/issues/1372#issuecomment-5817...
How many people did it take to build the Great Pyramid?
> The potential energy of the pyramid—the energy needed to lift the mass above ground level—is simply the product of acceleration due to gravity, mass, and the center of mass, which in a pyramid is one-quarter of its height. The mass cannot be pinpointed because it depends on the specific densities of the Tura limestone and mortar that were used to build the structure; I am assuming a mean of 2.6 metric tons per cubic meter, hence a total mass of about 6.75 million metric tons. That means the pyramid’s potential energy is about 2.4 trillion joules.
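The quoted figure checks out if we assume the Great Pyramid's approximate original height of ~146.6 m (the height itself is not stated in the excerpt):

```python
g = 9.81                       # m/s^2
height = 146.6                 # m; approximate original height (assumed)
mass = 6.75e6 * 1000           # 6.75 million metric tons -> kg
center_of_mass = height / 4    # for a solid, uniform pyramid

potential_energy = mass * g * center_of_mass
print(f"{potential_energy:.2e} J")  # ~2.4e12 J, matching the quoted figure
```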
In "Lost Technologies of the Great Pyramid" (2010) and "The Great Pyramid Prosperity Machine: Why the Great Pyramid was Built!" (2011), Steven Myers contends that the people who built the pyramids were master hydrologists who built a series of locks from the Nile all the way up the sides of the pyramids and pumped water up to a pool of water on the topmost level; where they used buoyancy and mechanical leverage by way of a floating barge crane in order to place blocks. This would explain how and why the pyramids are water tight, why explosive residue has been found in specific chambers, and why boats have been found buried at the bases of the pyramids.
https://www.amazon.com/dp/B0045Y26CC/
There are videos: http://www.thepump.org/video-series-2
https://www.youtube.com/playlist?list=PLt_DvKGJ_QLYvJ3IdVKXU...
I'm not aware of other explanations for how friction could have been overcome in setting the blocks such that they are watertight (in the later Egyptian pyramids).
AFAIU, the pyramids of South America appear to be of different - possibly older - construction methods.
Solar’s Future is Insanely Cheap
Power markets are more complicated than most people realize. One thing to note is that solar power can be cheaper than gas, but still not be economic. The fact of the matter is that an intermittent kWh is not as valuable as an on-demand, reliable kWh to a utility whose number-one priority is reliability. Even as solar is acquired at lower and lower prices, if evening power is generated by expensive and inefficient gas turbines, the customer might not see costs go down (and emissions might not go down either!). Solar is clearly economic in many markets, but we'll never get grid emissions in a place like California down much more without storage.
As mentioned in the article, solar power is insanely cheap. So while you are right that a kWh of solar is not, on average, as valuable as a kWh of semidispatchable coal, the insane cheapness means that at some point, it is economical to massively overbuild renewables and decommission fossil fuel plants. You provide no evidence for the implausible claim that solar causes emissions to go up: your argument is only that the emissions-reducing effect is muted, which may be true.
Also, while storage would be helpful, it is not the only way to enable a renewable transition. Additional transmission is enormously helpful: as the sun goes down on California, the wind is picking up in the Midwest. And don’t forget demand response: if smart thermostats received price signals (maybe we should precool this house...) that would alleviate the evening ramp-up issue.
So I claim we’ll need less storage than “a whole day’s usage”. But the learning curve applies to batteries as well! This storage won’t cost as much anyway.
The whole issue of intermittency is overrated. While a single solar panel might generate intermittently, the solar fleet across a whole state generates more predictably.
I am predicting that grid emissions will come down a lot over the coming decades. Partially I’m predicting the past: they’ve already come down, a lot!
> if smart thermostats received price signals (maybe we should precool this house...) that would alleviate the evening ramp-up issue.
Is there an existing model for retail intraday rates? Would intraday rates be desirable for all market participants?
"Add area for curtailment data?" https://github.com/tmrowco/electricitymap-contrib/issues/236...
Demo of an OpenAI language model applied to code generation [video]
So that's basically program synthesis from natural language (ish) specifications (i.e. the comments).
I can see this being a useful tool [1]. However, I don't expect any ability for innovation. At best this is like having an exceptionally smart autocomplete function that can look up code snippets on SO for you (provided those code snippets are no longer than one line).
That's not to say that it can't write new code, that nobody has quite written before in the same way. But in order for a tool like this to be useful it must stick as close as possible to what is expected- or it will slow development down rather than helping it. Which means it can only do what has already been done before.
For instance- don't expect this to come up with a new sorting algorithm, out of the blue, or to be able to write good code to solve a certain problem when the majority of code solving that problem on github happens to be pretty bad.
In other words: everyone can relax. This will not take your job. Or mine.
____________
[1] I apologise to the people who know me and who will now be falling off their chairs. OK down there?
I think you are underselling the potential of a model which deeply understands programming. Imagine combining such a model with something like AutoML-Zero: https://arxiv.org/abs/2003.03384 It may not be 'creative', but used as tab-completion, it's not being rewarded or incentivized or used in any way which would expose its abilities toward creating a new sorting algorithm.
I agree on the tab-completion part. Something like Gmail's smart-compose could have potentially huge benefits here.
But I'm not sure about the "deeply understands programming" part. Language modelling and "AI", in its current form, uncovers only statistical correlations and barely scratches the surface of what "understanding" is. This has restricted deployment of the majority of academic research into the real world, and this, I believe, is no different and will work only in constrained settings.
Edit: typo
It would be nice to have an AI that could write unit tests, or look over your code and understand and explain where you might have bugs.
> look over your code and understand and explain where you might have bugs.
This would certainly be interesting. I'm not aware of active research going on in this area (any pointers would be helpful!).
This would require an agent to have thorough understanding of the logic you're trying to implement, and locate the piece of code where it silently fails. For this you'd again need a training dataset where the input is a piece of code and the supervision signal (the output) is location of the bug. I could imagine some sort of self-supervision to tackle this initially where you'd intentionally introduce bugs in your code to generate training data. But not sure how far this can go!
1. Generate test cases from function/class/method definitions.
2. Generate test cases from fuzz results.
3. Run tests and walk outward from symbols around relevant stacktrace frames (line numbers).
4. Mutate and run the test again.
...
Model-based Testing (MBT) https://en.wikipedia.org/wiki/Model-based_testing
> Models can also be constructed from completed systems
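Step 1 of the list above can be sketched as signature-driven fuzzing that freezes observed outputs into regression assertions. The helper and its output format below are made up for illustration:

```python
import inspect
import random

def generate_regression_tests(func, n_cases=5, seed=0):
    """Fuzz a function with random ints and record (input -> output)
    pairs as ready-to-paste assert statements."""
    rng = random.Random(seed)
    params = inspect.signature(func).parameters
    tests = []
    for _ in range(n_cases):
        args = [rng.randint(-100, 100) for _ in params]
        result = func(*args)
        call = f"{func.__name__}({', '.join(map(str, args))})"
        tests.append(f"assert {call} == {result!r}")
    return tests

def clamp(x, lo, hi):
    return max(lo, min(x, hi))

for line in generate_regression_tests(clamp):
    print(line)
```

This only characterizes current behavior (steps 3 and 4, running and mutating the tests, are where the real work is), but characterization tests like these are a common first rung toward the model-based testing linked below.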
> At best this is like having an exceptionally smart autocomplete function that can look up code snippets on SO for you (provided those code snippets are no longer than one line).
Yeah, all it could do for you is autocomplete around what it thinks the specification might be at that point in time.
> But what if Andy gets another dinosaur, a mean one? -- Toy Story (1995)
Future of the human climate niche
How many degrees Celsius hotter would that be for billions of people in 50 years?
> The Paris Agreement's long-term temperature goal is to keep the increase in global average temperature to well below 2 °C above pre-industrial levels; and to pursue efforts to limit the increase to 1.5 °C, recognizing that this would substantially reduce the risks and impacts of climate change. This should be done by reducing emissions as soon as possible, in order to "achieve a balance between anthropogenic emissions by sources and removals by sinks of greenhouse gases" in the second half of the 21st century. It also aims to increase the ability of parties to adapt to the adverse impacts of climate change, and make "finance flows consistent with a pathway towards low greenhouse gas emissions and climate-resilient development."
> Under the Paris Agreement, each country must determine, plan, and regularly report on the contribution that it undertakes to mitigate global warming. [6] No mechanism forces [7] a country to set a specific emissions target by a specific date, [8] but each target should go beyond previously set targets.
And then this is what was decided:
> In June 2017, U.S. President Donald Trump announced his intention to withdraw the United States from the agreement. Under the agreement, the earliest effective date of withdrawal for the U.S. is November 2020, shortly before the end of President Trump's 2016 term. In practice, changes in United States policy that are contrary to the Paris Agreement have already been put in place.[9]
https://en.wikipedia.org/wiki/Paris_Agreement
Ask HN: Best resources for non-technical founders to understand hacker mindset?
Background: technical founder wondering what reading to recommend to a business focused founder for them to grok the hacker mindset. I've thought perhaps Mythical Man Month and How To Become A Hacker (Eric Raymond essay) but not sure they're quite right.
Any suggestions?
(In case it helps an analogue in the mathematical world might be A Mathematician's Apology or Gödel, Escher, Bach.)
Let me sum up all possible books about understanding the "hacker" (terrible word by the way, because of multiple meanings -- which meaning are we talking about?) mindset, from a management perspective:
1) True "hackers" value knowledge over money.
2) True "hackers" value doing things once and doing them right, no matter how long that takes. (Compare to the business mindset of "we need it now", or "we needed it yesterday")
3) True "hackers" value taking ownership in their work, that is, whatever they work on becomes an extension of themselves, much like an artist working on a work of art.
4) True "hackers" are not about work-arounds. If/when work-arounds are used, it's because there's an artificial timeframe (as might be found in the corporate world), and a lack of understanding of the infrastructure which created the need for that work-around.
But, all of these virtues run counter to the demands of business, which constantly wants more things done faster, cheaper, with more features, more complexity, less testing, and doesn't want to worry about problems that may be caused by all of those things in the future (less accountability) -- as long as customer revenue can be collected today.
You see, a true "hacker's" values -- are completely different than those of big business...
And business people wonder why there's stress and burnout among tech people...
> 3) True "hackers" value taking ownership in their work, that is, whatever they work on becomes an extension of themselves, much like an artist working on a work of art
There's something to be said about owning your work, but I have to disagree that unhealthy attachment to work products is a universal attribute of technical founder hackers. It's not a kid, it's a thing that was supposed to be the best use of the resources and information available at the time.
I must have confused this point with vanity and retention in projecting my own counterproductive anti-patterns.
Being prolific is not the objective for a true hacker, but a guy I know (definitely not me) mentioned something about starting projects and seeing the next 5 years of potentially happily working on that project, too.
Dissecting the code responsible for the Bitcoin halving
> The difficulty of the calculations are determined by how many zeroes need to be at the front. [...]
The difficulty is actually not determined by the number of leading zeroes (as was initially the case).
https://en.bitcoinwiki.org/wiki/Difficulty_in_Mining :
> The Bitcoin network has a global block difficulty. Valid blocks must have a hash below this target. Mining pools also have a pool-specific share difficulty setting a lower limit for shares.
Comparing the hash against a full 256-bit target ("less than") instead of counting leading zeroes allows the difficulty to be adjusted in much finer increments at each retargeting.
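A sketch of that validity check (illustrative Python, not Bitcoin Core's actual C++):

```python
import hashlib

def meets_target(header_bytes, target):
    """Interpret the double-SHA256 of a block header as a 256-bit integer
    (little-endian, as Bitcoin does) and require it to be at most the
    target -- not merely to start with N zero bits, which is what allows
    fine-grained difficulty steps."""
    digest = hashlib.sha256(hashlib.sha256(header_bytes).digest()).digest()
    return int.from_bytes(digest, "little") <= target
```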
Difficulty retargetings occur every 2016 blocks, which takes about two weeks at the target ~10-minute block time. That assumes the hash rate doesn't suddenly collapse: if most miners disappeared, the longer block times could stretch a single retargeting period out to months, according to "What would happen if 90% of the Bitcoin miners suddenly stopped mining?" https://bitcoin.stackexchange.com/questions/22308/what-would...
At each 2016-block retargeting, difficulty is adjusted up or down in order to keep the block time near ~10 minutes.
The block reward halving occurs every ~4 years (210,000 blocks).
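The halving logic itself is tiny; here is a Python sketch of the subsidy schedule in the style of Bitcoin Core's GetBlockSubsidy (illustrative, including Core's guard against shifting by 64+ bits):

```python
COIN = 100_000_000  # satoshis per bitcoin

def block_subsidy(height, halving_interval=210_000):
    """Subsidy schedule: start at 50 BTC and right-shift (halve) the
    reward once per 210,000-block era."""
    halvings = height // halving_interval
    if halvings >= 64:  # shifting a 64-bit int by >= 64 is undefined in C++
        return 0
    return (50 * COIN) >> halvings

# e.g. block_subsidy(630_000) == 625_000_000 satoshis, i.e. 6.25 BTC
```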
Relatedly, Moore's law observes/predicts that processing power (as measured by transistor count per chip) will double every 2 years while price stays the same. Is energy efficiency independent of transistor count? https://en.wikipedia.org/wiki/Moore%27s_law
Ask HN: Does mounting servers parallel with the temperature gradient trap heat?
Heat rises. Is heat trapped in the rack? Would mounting servers sideways (vertically) allow heat to transfer out of the rack?
Many systems have taken the vertical mount approach over the years: blade servers, routers, modems, and various gaming systems.
Horizontally-mounted: parallel with the floor
Vertically-mounted: perpendicular to the floor
[deleted]
Thermodynamics https://en.wikipedia.org/wiki/Thermodynamics
Are engine cylinders ever mounted horizontally? Why or why not?
> Heat rises.
Warmer air is less dense / more buoyant; so it floats.
"Does hot air really rise?" https://physics.stackexchange.com/questions/6329/does-hot-ai...
- Water ice floats because – somewhat uniquely – solid water is less dense than liquid water.
> Is heat trapped in the rack?
Probably.
> Would mounting servers sideways (vertically) allow heat to transfer out of the rack?
How could we find studies that have already tested this hypothesis?
What does the 'rc' in `.bashrc`, etc. mean?
For me, the hard part is remembering if .bashrc is supposed to source .bash_profile or vice versa.
info bash -n "Bash Startup Files"
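The convention those docs describe: login shells read ~/.bash_profile and interactive non-login shells read ~/.bashrc, so the usual setup is for ~/.bash_profile to source ~/.bashrc (not the other way around):

```shell
# ~/.bash_profile -- read by login shells; delegate to ~/.bashrc so both
# login and non-login interactive shells get the same configuration.
if [ -f ~/.bashrc ]; then
    . ~/.bashrc
fi
```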
https://www.gnu.org/software/bash/manual/html_node/Bash-Star...
Google ditched tipping feature for donating money to sites
> When asked, Google confirmed that the designs were an internal idea it explored last year but decided not to pursue as part of [Google Contributor] and Google Funding Choices, which lets sites ask visitors to disable ad blockers, or instead buy a subscription or pay a per page fee to remove ads.
Could this be built on Web Monetization API (ILP (Interledger Protocol)) and e.g. Google Pay as one of many possible payment/card/cryptocurrency processing backends; just like Coil is built on Web Monetization API?
Innovating on Web Monetization: Coil and Firefox Reality
Coil: $5/mo, Content creators get a proportional cut of that amount according to what is browsed with the browser extension enabled or the Puma browser, and Private: No Tracking
> Coil sends payments via the Interledger Protocol, which allows any currency to be used for sending and receiving.
It looks like the Web Monetization API is not yet listed on the Website Monetization Wikipedia page: https://en.wikipedia.org/wiki/Website_monetization
Quoting from earlier this week:
> Web Monetization API (ILP: Interledger Protocol)
>> A JavaScript browser API which allows the creation of a payment stream from the user agent to the website.
>> Web Monetization is being proposed as a #W3C standard at the Web Platform Incubator Community Group.
> https://webmonetization.org/
> Interledger: Web Monetization API https://interledger.org/rfcs/0028-web-monetization/
Ask HN: Recommendations for online essay grading systems?
Which automated essay grading systems would you recommend? Are they open source?
How can we identify biases in these objective systems?
What are your experiences with these systems as authors and graders?
Who else remembers using the Flesch-Kincaid Grade Level metric in Word to evaluate school essays? https://en.wikipedia.org/wiki/Flesch%E2%80%93Kincaid_readabi...
Imagine my surprise when I learned that this metric is not one that was created for authors to maximize: reading ease for the widest audience is not merely an objective in some departments, but a requirement.
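For reference, the Flesch-Kincaid grade level is a simple closed-form metric over raw text counts; a minimal sketch:

```python
def flesch_kincaid_grade(words, sentences, syllables):
    """Flesch-Kincaid grade level from raw counts, per the published
    formula: 0.39*(words/sentences) + 11.8*(syllables/words) - 15.59."""
    return 0.39 * (words / sentences) + 11.8 * (syllables / words) - 15.59

# e.g. 100 words, 5 sentences, 150 syllables -> roughly grade 9.9
```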
What metrics do and should online essay grading systems present? As continuous feedback to authors, or as final judgement?
I'm reminded of a time in high school when an essay that I wrote was flagged by an automated essay verification engine as plagiarism. I certainly hadn't plagiarized, and it was up to me to show that each identified keyword-similar internet resource was not an uncited source of my paper. I disengaged. I later wrote an essay about how keyword search tools could be helpful to students doing original research. True story.
Decades later, I would guess that human review is still advisable.
This need of mine to have others validate my unpaid work has nothing to do with that traumatic experience.
I still harbor this belief in myself: that what I have to say is worth money to others, and that - someday - I'll pay a journal to consider my ScholarlyArticle for publishing in their prestigious publication with maybe even threaded peer review (and #StructuredPremises linking to Datasets and CreativeWorks that my #LinkedMetaAnalyses are predicated upon). Someday, I'll develop an online persona as a scholar, as a teacher, maybe someday as a TA or an associate professor and connect my CV to any or all of the social networks for academics. I'll work to minimize the costs of interviewing and searching public records. My research will be valued and funded.
Or maybe, like 20% time, I'll find time and money on the side for such worthwhile investigations; and what I produce will be of value to others: more than just an exercise in hearing myself speak.
In my years of internet communications, I've encountered quite a few patrons; lurkers; participants; and ne'er-do-wells who'll order 5 free waters, plaster their posters to the walls, harass paying customers, and just walk out like nothing's going to happen. Moderation costs time and money; and it's a dirty job that sometimes pays okay. There are various systems for grading these comments, these essays, these NewsArticles, these ScholarlyArticles. Human review is still advisable.
> How can we identify biases in these objective systems?
Modern "journalism" recognizes that it's not a one-way monologue but a dialogue: people want to comment. Ignorantly, helpfully, relevantly, insightfully, experiencedly. What separates the "article part" from the "comments part" of the dialogue? Typesetting, CSS, citations, quality of argumentation?
Ask HN: Systems for supporting Evidence-Based Policy?
What tools and services would you recommend for evidence-based policy tasks like meta-analysis, solution criteria development, and planned evaluations according to the given criteria?
Are they open source? Do they work with linked open data?
> Ask HN: Systems for supporting Evidence-Based Policy?
> What tools and services would you recommend for evidence-based policy tasks like meta-analysis, solution criteria development, and planned evaluations according to the given criteria?
> Are they open source? Do they work with linked open data?
I suppose I should clarify that citizens, consumers, voters, and journalists are not acceptable answers.
Facebook, Google to be forced to share ad revenue with Australian media
What if Google and Facebook just said “No, thank you.”? I think Google and Facebook have more power in this relationship, which might say something about the tech giants. The only possible thing the Aussies could do is block the websites, and that sounds like political suicide to me. I’m guessing the companies will decide to play ball, but if other countries start to follow suit, it might be interesting to see what happens if they start pushing back.
There would be no political suicide there. It's so easy to make Google/Facebook look evil in this case if they don't comply: just say they aren't paying taxes and are taking our money abroad, etc.
But there is 0% chance of Google/Facebook not complying.
> I think Google and Facebook have more power in this relationship
Not even close.
Don't the French and Spanish examples indicate otherwise? Spain's link tax in 2014 led to the shutting-down of Google News Spain, and France's attempt to charge for Google News snippets led to the removal of those snippets for French sites (followed by a dip in traffic for pubs that hurt them far more than it hurt Google). What makes you think that FB/G don't have the power here?
If you don't want them to index your content and send you free traffic, you can already specify that in your robots.txt; for free. https://en.wikipedia.org/wiki/Robots_exclusion_standard
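For example, a robots.txt that opts a whole site out of Googlebot's index looks like this:

```
User-agent: Googlebot
Disallow: /
```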
There are no ads on Google News.
There is an apparent glut of online news: supply exceeds demand and so the price has fallen.
> There are no ads on Google News.
This is the ironic part of the policy. You're right about the glut, and there was a glut in print 20 years ago, too. Google's definitely hurt businesses, but it's usually through disintermediating them (think Yelp). Google News and search don't really do that for news--snippets and headlines aren't the same as an article. I find it hard to believe that ABC's brand isn't strong enough for them to pull their content from Google and expect people to go to abc.net.au; I just don't think they can sell enough ads to give the content away.
By hurt, do you mean competed with by effectively utilizing technology to help people find information about the world from multiple sources?
There are very many news aggregators and most do serve ads next to the headlines they index. I assume that people typically link out from news aggregation sites more than into vertically-integrated services.
Perhaps the content producers / information service providers could develop additional revenue streams in order to subsidize a news aggregation public service. Micropayments (BAT, Web Monetization (ILP)), ads, paywalls, and public and private grants are sources of revenue for content producers.
I think it's disingenuous to blame news aggregation sites for the unprofitability of extremely redundant journalism. What happened to journalism? The internet. Excessive ads. Aren't we all writers these days?
Unfortunately they killed the "most cited" and (was it?) "most in-depth" source analysis functions of Google News; now we're stuck with regurgitated news wires, press releases, eyewitness mobile-phone videos with two-bit banal commentary, and punditry. How the world has changed.
So, as far as scientific experiments are concerned, it might be interesting to see what the impact of de-listing from free time sites X, Y, and Z is.
Do the papers in Australia and France now intend to compensate journal ScholarlyArticle authors whose work they summarize and hopefully at least cite the titles and URLs of, or the journals themselves?
France rules Google must pay news firms for content
Website monetization https://en.wikipedia.org/wiki/Website_monetization
Web Monetization API (ILP: Interledger Protocol)
> A JavaScript browser API which allows the creation of a payment stream from the user agent to the website.
> Web Monetization is being proposed as a #W3C standard at the Web Platform Incubator Community Group.
Interledger: Web Monetization API https://interledger.org/rfcs/0028-web-monetization/
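For context, the Web Monetization draft of that era signalled where to stream payments via a meta tag holding a payment pointer; the pointer below is a placeholder, not a real wallet:

```html
<meta name="monetization" content="$wallet.example.com/alice">
```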
Khan Academy, for example, accepts BAT (Basic Attention Token) micropayments/microdonations that e.g. Brave browser users can opt to share with the content producers and indexers. https://en.wikipedia.org/wiki/Brave_(web_browser)#Basic_Atte...
Web Monetization w/ Interledger should enable any payments system with low enough transaction costs ("ledger-agnostic, currency agnostic") to be used to pay/tip/donate to content producers who are producing unsensational, unbiased content that people want to pay for.
Paywalls/subscriptions and ads are two other approaches to funding quality journalism.
Should journalists pay ScholarlyArticle authors whose studies they publish summaries of without even citing the DOI/URL and Title; or the journals said ScholarlyArticles are published in? https://schema.org/ScholarlyArticle
Adafruit Thermal Camera Imager for Fever Screening
> Thermal Camera Imager for Fever Screening with USB Video Output - UTi165K. PRODUCT ID: 4579 https://www.adafruit.com/product/4579
> This video camera takes photos of temperatures! This camera is specifically tuned to work in the 30˚C~45˚C / 86˚F~113˚ F range with 0.5˚C / 1˚ F accuracy, so it's excellent for human temperature & fever detection. In fact, this thermal camera is often used by companies/airports/hotels/malls to do a first-pass fever check: If any person has a temperature of over 99˚F an alarm goes off so you can do a secondary check with an accurate handheld temperature meter.
> You may have seen thermal 'FLIR' cameras used to find air leaks in homes, but those cameras have a very wide temperature range, so they're not as accurate in the narrow range used for fever-scanning. This camera is designed specifically for that purpose!
... USB Type-C, SD Card; no price listed yet?
The end of an Era – changing every single instance of a 32-bit time_t in Linux
Thanks!
Year 2038 Problem > Solutions is already updated re: 5.6. https://en.wikipedia.org/wiki/Year_2038_problem
Ask HN: What's the ROI of Y Combinator investments?
To calculate the ROI of YC investments, we could find the terms of the YC investments (x for y%, preference) and find the exit rate (what % of companies exit).
We could search for 'ROI of ycombinator investments' and find valuation numbers from a number of years ago.
From the first page of search results, we'd then learn about "return on capital" and how the standard YC seed terms have changed over the years.
Return on capital: https://en.wikipedia.org/wiki/Return_on_capital
From the See also section of this Wikipedia page, we might discover "Cash flow return on investment" and "Rate of return on a portfolio"
From the "rate of return" Wikipedia page, we might learn that "The return on investment (ROI) is return per dollar invested. It is a measure of investment performance, as opposed to size (c.f. return on equity, return on assets, return on capital employed)." and that "The annualized return of an investment depends on whether or not the return, including interest and dividends, from one period is reinvested in the next period. " https://en.wikipedia.org/wiki/Rate_of_return
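As a back-of-the-envelope sketch of those two definitions (the $125k check size below is purely illustrative, since the standard YC terms have changed over the years):

```python
def roi(proceeds, cost):
    """Plain return on investment: net return per dollar invested."""
    return (proceeds - cost) / cost

def annualized(total_return, years):
    """Annualized (geometric) rate, assuming the return compounds; this is
    the 'reinvested each period' case the quote distinguishes."""
    return (1 + total_return) ** (1 / years) - 1

# Hypothetical: a $125k check returning $500k after 6 years.
# roi(...) == 3.0 (300%); annualized(3.0, 6) is about 0.26 (26%/yr).
```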
From the YCombinator Wikipedia page, we might read that "The combined valuation of the top YC companies was over $155 billion as of October, 2019. [4]" and that "As of late 2019, Y Combinator had invested in >2,000 companies [37], most of which are for-profit. Non-profit organizations can also participate in the main YC program. [38]" and then read about "seed accelerators" and then "business incubators" in search of appropriate metrics for comparing VC performance. https://en.wikipedia.org/wiki/Y_Combinator
ROI is such a frou-frou statistic anyway. What does that even mean, ROI? In any case, YC itself is not a public company, AFAICT, so it's not as easy as going to https://YCharts.com, entering the equity symbol, clicking on "Key Stats", and scrolling down to "Profitability" to review [Gross | EBITDA | Operating] [Profit] Margin.
The LTSE (Long-Term Stock Exchange) is where people who are in this for real are really doing it now.
Microsoft announces Money in Excel powered by Plaid
This looks really useful.
At first glance, I found a number of ways to push transaction data into Google Sheets from the Plaid API:
build-your-own-mint (NodeJS, CircleCI) https://github.com/yyx990803/build-your-own-mint
go-plaid: https://github.com/ebcrowder/go-plaid
Presumably, like the GOOGLEFINANCE function, there's some way to pull data from an API with just Apps Script (~JS) without an auxiliary serverless function to get the txs from Plaid and post to the gsheets API?
Lora-based device-to-device smartphone communication for crisis scenarios [pdf]
Sudomesh has been working on one of these devices, disaster radio: https://disaster.radio
I like the solar integration. My ideal product would be a programmable LoRa transceiver that integrates into a waterproof battery bank with solar, GPS, and a small touch screen. It should be about the size of a large handheld, which could be mounted on a house, tree, or mast, or carried in a pocket/bag. Apps could be SOS messages, chat, and weather information. If it's extensible (e.g. something like an app store or package manager), I'm sure folks will dream up additional uses.
Sort of like a more hackable/accessible phone and long range radio.
Unfortunately, the Earl tablet never made it to market: https://blog.the-ebook-reader.com/2015/01/26/video-update-ab...
Earl specs: Waterproof; Solar charging; eInk; ANT+; NFC; VHF/UHF transceiver (GMRS, PMR446, UHFCB); GPS; Sensors: Accelerometer, Gyroscope, Magnetometer, Temperature, Barometer, Humidity; AM/FM/SW/LW/WB
LTE, LoRa, 5G, and Hostapd would be great
Being able to plug it into a powerbank and antennas for use as a fixed or portable e.g. BATMAN mesh relay would be great
"LoRa+WiFi ClusterDuck Protocol by Project OWL for Disaster Relief" https://news.ycombinator.com/item?id=22707267
> An opkg (for e.g. OpenWRT) with this mesh software would make it possible to use WiFi/LTE routers with a LoRa transmitter/receiver connected over e.g. USB or Mini-PCIe.
LoRa+WiFi ClusterDuck Protocol by Project OWL for Disaster Relief
> Project OWL (Organization, Whereabouts, and Logistics) creates a mesh network of Internet of Things (IoT) devices called DuckLinks. These Wi-Fi-enabled devices can be deployed or activated in disaster areas to quickly re-establish connectivity and improve communication between first responders and civilians in need.
> In OWL, a central portal connects to solar- and battery-powered, water-resistant DuckLinks. These create a Local Area Network (LAN). In turn, these power up a Wi-Fi captive portal using low-frequency Long-range Radio (LoRa) for Internet connectivity. LoRA has a greater range, about 10km, than cellular networks.
...
> You don't actually need a DuckLink device. The open-source OWL firmware can quickly turn a cheap wireless device into a DuckLink using the -- I swear I'm not making this up -- ClusterDuck Protocol. This is a mesh network node, which can hook up to any other near-by Ducks.
> OWL is more than just hardware and firmware. It's also a cloud-based analytic program. The OWL Data Management Software can be used to facilitate organization, whereabouts, and logistics for disaster response.
Homepage: http://clusterduckprotocol.org/
GitHub: https://github.com/Code-and-Response/ClusterDuck-Protocol
The Linux Foundation > Code and Response https://www.linuxfoundation.org/projects/code-and-response/
An opkg (for e.g. OpenWRT) with this mesh software would make it possible to use WiFi/LTE routers with a LoRa transmitter/receiver connected over e.g. USB or Mini-PCIe.
... cc'ing from https://twitter.com/westurner/status/1238859774567026688 :
OpenWRT is a Make-based embedded Linux distro w/ a LuCI (Lua + JSON + UCI) web interface.
#OpenWRT runs on Raspberry Pis, ARM, x86, and MIPS; there's a Docker image. OpenWRT Supported Devices: https://openwrt.org/supported_devices
OpenWRT uses opkg packages: https://openwrt.org/docs/guide-user/additional-software/opkg
I searched for "Lora" in OpenWRT/packages: lora-gateway-hal opkg package: https://github.com/openwrt/packages/blob/master/net/lora-gat...
lora-packet-forwarder opkg package (w/ UCI integration): https://github.com/openwrt/packages/pull/8320
https://github.com/xueliu/lora-feed :
> Semtech packages and ChirpStack [(LoRaserver)] Network Server stack for OpenWRT
> > [In addition to providing node2node/2net connectivity, #batman-adv can bridge VLANs over a mesh (or link), such as for “trusted” client, guest, IoT, and mgmt networks. It provides an easy-to-configure alternative to other approaches to “backhaul”, […]] https://openwrt.org/docs/guide-user/network/wifi/mesh/batman
> I have a few different [quad-core, MIMO] ARM devices without 4G. TIL that the @GLiNetWifi devices ship with OpenWRT firmware (and a mobile config app) and some have 1-2 (Mini-PCIe) 4G w/ SIM slots. Also, @turris_cz has OpenWRT w/ LXC in the kernel build. https://t.co/Rz0Uu5uHJQ
A Visual Debugger for Jupyter
Source: jupyterlab/debugger https://github.com/jupyterlab/debugger
When will jupyter have "highlight and execute" functionality? The cell concept is fine, but I'm constantly copy pasting snippets of code into new cells to get that "incremental" coding approach...
So, I went looking for the answer to this because in the past I've installed the scratchpad extension by installing jupyter_contrib_nbextensions, but those don't work with JupyterLab because there's a new extension model for JupyterLab that requires node and npm.
Turns out that with JupyterLab, all you have to do is right-click and select "New Console for Notebook" and it opens a console pane below the notebook already attached to the notebook kernel. You can also instead do File > New > Console and select a kernel listed under "Use Kernel From Other Session".
The "New action runInConsole to allow line by line execution of cell content" PR adds a notebook command `notebook:run-in-console`, but you have to add the associated keyboard shortcut to your config yourself; e.g. bind `Ctrl Shift Enter` or `Ctrl-G` to `notebook:run-in-console`. https://github.com/jupyterlab/jupyterlab/pull/4330
"In Jupyter Lab, execute editor code in Python console" describes how to add the associated keyboard shortcut to your config: https://stackoverflow.com/questions/38648286/in-jupyter-lab-...
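Concretely, the user override in JupyterLab's Settings > Keyboard Shortcuts might look like the following (the selector and exact schema are assumptions that may vary by JupyterLab version, so check against your install):

```json
{
  "shortcuts": [
    {
      "command": "notebook:run-in-console",
      "keys": ["Ctrl Shift Enter"],
      "selector": ".jp-Notebook.jp-mod-editMode"
    }
  ]
}
```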
Ask HN: What's the Equivalent of 'Hello, World' for a Quantum Computer?
The 'Hello,World' program is one of the simplest programs to demonstrate how to go about writing a program in a new programming language.
What is an equivalent simple program which demonstrates how to write a very simple program for a quantum computer?
I have tried (and failed) to imagine such a program. Can somebody who has actually used a quantum computer show us an actual quantum computer program?
"Getting Started with Qiskit" https://qiskit.org/documentation/getting_started.html
Qiskit / qiskit-community-tutorials > "1 Hello, Quantum World with Qiskit" https://github.com/Qiskit/qiskit-community-tutorials#1-hello...
"Quantum basics with Q#" https://docs.microsoft.com/en-us/quantum/quickstart?tabs=tab...
Qutip notebooks: https://github.com/qutip/qutip#run-notebooks-online
Jupyter Notebooks labeled quantum-computing: https://github.com/topics/quantum-computing?l=jupyter+notebo...
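As a library-free illustration of what those "hello, quantum world" tutorials all do, here is the single-qubit Hadamard-then-measure experiment sketched in plain Python (a classical simulation, not real quantum hardware):

```python
import math
import random

def hadamard_measure(shots=1000, seed=0):
    """Put one qubit in (|0> + |1>)/sqrt(2) by applying a Hadamard to |0>,
    then sample `shots` measurements via the Born rule; expect ~50/50."""
    amp1 = 1 / math.sqrt(2)    # amplitude of |1> after H|0>
    p_one = abs(amp1) ** 2     # Born rule: P(outcome 1) = |amplitude|^2
    rng = random.Random(seed)  # seeded for reproducibility
    ones = sum(rng.random() < p_one for _ in range(shots))
    return {"0": shots - ones, "1": ones}
```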
Ask HN: Communication platforms for intermittent disaster relief?
Are there good platforms for disaster relief that work well with intermittent connectivity (i.e. spotty 3G/4G/WiFi/LoRa)?
How can major networks improve in terms of e.g. indicating message delivery status, most recent sync time, sync throttling status due to load, optionally downloading images/audio/video, referring people to local places and/or forms for help with basic needs, etc?
What are some tools that app developers can use to simulate intermittent connectivity when running tests?
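On the last question, one common approach is a test double that randomly drops sends, so retry/queueing logic can be exercised deterministically; a minimal sketch (all names here are illustrative):

```python
import random

class FlakyTransport:
    """Test double for an unreliable link: drops a configurable fraction
    of sends so an app's retry/queueing logic can be exercised."""

    def __init__(self, loss_rate, seed=0):
        self.loss_rate = loss_rate
        self.rng = random.Random(seed)  # seeded: repeatable test runs
        self.delivered = []

    def send(self, msg):
        if self.rng.random() < self.loss_rate:
            raise ConnectionError("simulated drop")
        self.delivered.append(msg)

def send_with_retry(transport, msg, attempts=5):
    """Retry until delivered or attempts run out; the boolean is the
    delivery status an app should surface to its users."""
    for _ in range(attempts):
        try:
            transport.send(msg)
            return True
        except ConnectionError:
            continue
    return False
```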
DroneAid: A Symbol Language and ML model for indicating needs to drones, planes
From the README https://github.com/Code-and-Response/DroneAid :
> The DroneAid Symbol Language provides a way for those affected by natural disasters to express their needs and make them visible to drones, planes, and satellites when traditional communications are not available.
> Victims can use a pre-packaged symbol kit that has been manufactured and distributed to them, or recreate the symbols manually with whatever materials they have available.
> These symbols include those below, which represent a subset of the icons provided by The United Nations Office for the Coordination of Humanitarian Affairs (OCHA). These can be complemented with numbers to quantify need, such as the number or people who need water.
Each of the symbols is drawn within a triangle pointing up:
- Immediate Help Needed (orange; downward triangle \n SOS),
- Shelter Needed (cyan; like a guy standing in a tall pentagon without a floor),
- OK: No Help Needed (green; upward triangle \n OK),
- First Aid Kit Needed (yellow; briefcase with a first aid cross),
- Water Needed (blue; rain droplet),
- Area with Children in Need (lilac; baby-looking thing with a diaper on),
- Food Needed (red; pan with wheat drawn above it),
- Area with Elderly in Need (purple; person with a cane)
So, we're going to need some artists; something to write large things with; some orange, cyan, green, yellow, blue, lilac, red, and purple things; some people who can tell me the difference between lilac (light purple: babies) and purple (darker purple: old people); and some drones that can capture location and imagery.
Note that DroneAid is also a project of The Linux Foundation Code and Response organization.
Ask HN: Computer Science/History Books?
Hi guys, can you recommend interesting books on Computer Science or computer history (similar to Dealers of Lightning) to read in these quarantine times? I really like the subject and am looking for something to keep myself away from TV at night.
Thank you.
The Information: a History, a Theory, a Flood by James Gleick is great. Starts with Ada Lovelace/Charles Babbage and goes on from there, I found it fascinating.
"The Information: A History, a Theory, a Flood" starts with "1 | Drums That Talk" re: African drum messaging; a complex coding scheme:
> Here was a messaging system that outpaced the best couriers, the fastest horses on good roads with way stations and relays.
https://en.wikipedia.org/wiki/The_Information:_A_History,_a_...
https://www.goodreads.com/book/show/8701960-the-information
From "Polynesian People Used Binary Numbers 600 Years Ago" https://www.scientificamerican.com/article/polynesian-people... :
> Binary arithmetic, the basis of virtually all digital computation today, is usually said to have been invented at the start of the eighteenth century by the German mathematician Gottfried Leibniz. But a study now shows that a kind of binary system was already in use 300 years earlier among the people of the tiny Pacific island of Mangareva in French Polynesia.
Open-source security tools for cloud and container applications
It's odd they don't list OPA Gatekeeper, which is probably the best tool for enforcing security and other best practices in Kubernetes clusters.
List of CNCF open source security projects without the blog post: https://landscape.cncf.io/category=security-compliance&forma...
> List of CNCF open source security projects without the blog post: https://landscape.cncf.io/category=security-compliance&forma...
Thanks for this.
YC Companies Responding to Covid-19
Are life sciences and healthcare familiar verticals for YC?
Good to see money and talent going to such good use.
(Edit) Here's the YC Companies list, which doesn't yet list these new investments:
Biomedical vertical: https://www.ycombinator.com/companies/?vertical=Biomedical
Healthcare vertical: https://www.ycombinator.com/companies/?vertical=Healthcare
"The Y Combinator Database" https://www.ycdb.co/
Show HN: Neh – Execute any script or program from Nginx location directives
And we've come back round full circle to CGI scripts, although separating out headers and body on different fds sounds neat.
Many things we’ve dispensed with long ago seem to be forgotten and the younger coders are learning for themselves the hard way. Aka the same way we did.
One thing I'm wondering, however: how popular can CGI really be if the query "run script on nginx location directive" doesn't surface it?
Nginx probably somewhat-deliberately has FastCGI but not regular CGI for a number of reasons.
CGI has process-per-request overhead.
CGI typically runs scripts as the same user the webserver runs as; those processes can generally read and write anything that user can, unsandboxed (such as X.509 private keys).
Just about any app can be (D)DoS'd, but the process-per-request overhead of CGI means it takes fewer resources to do so.
In order to prevent resource exhaustion due to e.g. someone benignly hitting reload a bunch of times (thus creating multiple GET requests), applications should enqueue task messages that a limited number of workers retrieve from a (durable) FIFO or priority queue, updating each task's status as they go.
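A minimal sketch of that enqueue-and-bounded-workers pattern, using Python's standard library (names are illustrative):

```python
import queue
import threading

def serve(requests, num_workers=2):
    """Handlers only enqueue; a bounded worker pool drains the FIFO, so a
    burst of reloads can't spawn unbounded work (unlike CGI's
    process-per-request model)."""
    tasks = queue.Queue()
    results = []  # list.append is atomic under the GIL

    def worker():
        while True:
            item = tasks.get()
            if item is None:                 # sentinel: time to shut down
                return
            results.append(("done", item))   # the actual (cheap, here) work
            tasks.task_done()

    threads = [threading.Thread(target=worker) for _ in range(num_workers)]
    for t in threads:
        t.start()
    for request in requests:                 # the only thing a handler does
        tasks.put(request)
    tasks.join()                             # wait until every task is done
    for _ in threads:
        tasks.put(None)                      # one sentinel per worker
    for t in threads:
        t.join()
    return results
```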
Websockets may or may not scale better than long-polling for streaming stdout to a client.
Interesting and really informative! Thanks for sharing!
Ask HN: How can an intermediate-beginner learn Unix/Linux and programming?
For a long time, I’ve been in an awkward position with my knowledge of computers. I know basic JavaScript (syntax and booleans and nothing more). I’ve learned the bare basics of Linux from setting up a Pi-hole. I understand the concept of secure hashes. I even know some regex.
The problem is, I know so little that I can’t actually do anything with this knowledge. I suppose I’m looking for a tutorial that will teach me to be comfortable with the command line and a Unix environment, while also teaching me to code a language. Where should I start?
A really good book is Richard Stevens' Advanced Programming in the UNIX Environment[1]. (It sounds daunting, but it's not too bad when you combine it with an introductory text on C.) If you stick with C, you'll eventually know UNIX/Linux more deeply than most. It takes time, though, so go easy on yourself.
Also check GitHub for a bunch of repos that contain boilerplate code used in most daemons, illustrating signal handling, forking, etc.[2]
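The linked repos are mostly C, but the fork/exit-status plumbing they all share can be sketched with Python's os module (Unix only; a toy illustration, not daemon boilerplate):

```python
import os

# Fork a child, have it exit with a status code, and reap it in the parent --
# the core process-management plumbing behind every daemon.
pid = os.fork()
if pid == 0:
    # child: do work, then exit (os._exit skips atexit/stdio cleanup)
    os._exit(7)
else:
    # parent: block until the child terminates, then decode its exit status
    _, wait_status = os.waitpid(pid, 0)
    print(os.WEXITSTATUS(wait_status))  # 7
```

Real daemons add double-forking, setsid(), and signal handlers on top of this.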
I also suggest taking some open source projects from Daniel Bernstein (DJB) as examples of how to write secure code: qmail, djbdns, daemontools ... there are lots of great ideas there[3]
Write a couple of simple programs that utilize your code and expand from there. Learn the plumbing: write a Makefile, learn autotools/autoconf, how to write tests, how to use gdb, how to create shared libs and link a program against them (LD_LIBRARY_PATH/ldconfig), etc.
Most important: think of something that you'd like to write and go from there. Back in the late '90s I studied RFCs (MIME, SMTP, etc.) and then implemented things that I could actually use myself. Here is a recent project from a few weeks ago (an extreme edge case for security maximalists which will never be used by anyone other than myself ... but I really enjoyed writing it and learned a bunch of new stuff while doing so)[4]
If you need ideas or help looking at your code, don't hesitate to send me a mail.
[1] https://en.wikipedia.org/wiki/Advanced_Programming_in_the_Un...
> Also check GitHub for a bunch of repos that contain boilerplate code used in most daemons, illustrating signal handling, forking, etc.[2]
docker-systemctl-replacement is a (partial) reimplementation of systemd as one python script that can be run as the init process of a container that's helpful for understanding how systemd handles processes: https://github.com/gdraheim/docker-systemctl-replacement/blo...
systemd is written in C: https://github.com/systemd/systemd
> examples of how to write secure code
awesome-safety-critical > Coding Guidelines https://github.com/stanislaw/awesome-safety-critical/blob/ma...
Math Symbols Explained with Python
Average of a finite series: There's a statistics module in Python 3.4+:
  from statistics import mean, fmean  # fmean: Python 3.8+
  X = [1, 2, 3]
  mean(X)
  # may or may not be preferable to
  sum(X) / len(X)
https://docs.python.org/3/library/statistics.html#statistics...
Product of a terminating iterable:
  import operator
  from functools import reduce
  # from itertools import accumulate
  reduce(operator.mul, X)
Vector norm:
  import numpy as np
  from numpy import linalg as LA
  LA.norm(X)
https://docs.scipy.org/doc/numpy/reference/generated/numpy.l...
Function domains and ranges can be specified and checked at compile time with type annotations, at runtime with type()/isinstance(), or with something like pycontracts or icontract for checking preconditions and postconditions.
Dot product:
  Y = [4, 5, 6]
  np.dot(X, Y)
https://docs.scipy.org/doc/numpy/reference/generated/numpy.d...
Unit vector:
  X / np.linalg.norm(X)
Ask HN: Is there a way to convert a smartphone into a no-contact thermometer?
Wondering: is there an infrared dongle that can convert your phone into a no-contact thermometer to read body temperature?
Infrared thermometer: https://en.wikipedia.org/wiki/Infrared_thermometer
Thermography: https://en.wikipedia.org/wiki/Thermography
IDK what the standard error is for medical temperature estimation with e.g. a FLIR ONE thermal imaging camera for an Android/iOS device. https://www.flir.com/applications/home-outdoor/
I'd imagine that sanitization would be crucial for any clinical setting.
(Edit) "Prediction of brain tissue temperature using near-infrared spectroscopy" (2017) Neurophotonics https://www.ncbi.nlm.nih.gov/pmc/articles/PMC5469395/
"Nirs body temperature" https://scholar.google.com/scholar?hl=en&as_sdt=0%2C43&q=nir...
"Infrared body temperature" https://scholar.google.com/scholar?hl=en&as_sdt=0%2C43&q=inf...
"Infrared thermometer iOS" https://m.alibaba.com/trade/search?SearchText=infrared%20the...
"Infrared thermometer Android" https://alibaba.com/trade/search?SearchText=infrared%20therm...
Employee Scheduling
From "Ask HN: What algorithms should I research to code a conference scheduling app" https://news.ycombinator.com/item?id=15267804 :
> Resource scheduling, CSP (Constraint Satisfaction programming)
CSP: https://en.wikipedia.org/wiki/Constraint_satisfaction_proble...
Scheduling (production processes):
https://en.wikipedia.org/wiki/Scheduling_(production_process...
Scheduling (computing):
https://en.wikipedia.org/wiki/Scheduling_(computing)
... To an OS, a process thread has a priority and sometimes a CPU affinity.
From http://markmail.org/search/?q=list%3Aorg.python.omaha+pysche... :
Pyschedule:
- Src: https://github.com/timnon/pyschedule
From https://github.com/timnon/pyschedule :
> pyschedule is a python package to compute resource-constrained task schedules. Some features are:
- precedence relations: e.g. task A should be done before task B
- resource requirements: e.g. task A can be done by resource X or Y
- resource capacities: e.g. resource X can only process a few tasks
Previous use-cases include:
- school timetables: assign teachers to classes
- beer brewing: assign equipment to brewing stages
- sport schedules: assign stadiums to games
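This isn't pyschedule's actual API, but the school-timetable use case can be sketched as a toy CSP with plain-Python backtracking (the slot, teacher, and constraint names below are made up for illustration):

```python
def assign(slots, teachers, constraints, schedule=None):
    """Backtracking CSP search: returns a dict mapping slot -> teacher,
    or None if no assignment satisfies all constraints."""
    schedule = schedule or {}
    if len(schedule) == len(slots):
        return schedule
    slot = slots[len(schedule)]
    for teacher in teachers:
        candidate = {**schedule, slot: teacher}
        if all(ok(candidate) for ok in constraints):
            result = assign(slots, teachers, constraints, candidate)
            if result is not None:
                return result
    return None

def no_double_booking(s):
    # a teacher can't teach two 9am classes at once
    if "math-9am" in s and "physics-9am" in s:
        return s["math-9am"] != s["physics-9am"]
    return True

slots = ["math-9am", "physics-9am", "math-10am"]
teachers = ["Ada", "Grace"]
result = assign(slots, teachers, [no_double_booking])
print(result)
```

Real solvers (pyschedule, MIP/SAT backends) add heuristics and pruning, but the constraint-propagation idea is the same.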
... https://en.wikipedia.org/wiki/Slurm_Workload_Manager :
> Slurm is the workload manager on about 60% of the TOP500 supercomputers.[1]
Slurm uses a best fit algorithm based on Hilbert curve scheduling or fat tree network topology in order to optimize locality of task assignments on parallel computers.[2]
... https://en.wikipedia.org/wiki/Hilbert_curve_scheduling :
> [...] the Hilbert curve scheduling method turns a multidimensional task allocation problem into a one-dimensional space filling problem using Hilbert curves, assigning related tasks to locations with higher levels of proximity.[1] Other space filling curves may also be used in various computing applications for similar purposes.[2]
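The locality property can be demonstrated with the standard distance-to-coordinates conversion from the Wikipedia article: consecutive positions along the curve are always grid-adjacent, so tasks close together in the 1-D ordering land close together in 2-D.

```python
def rot(n, x, y, rx, ry):
    """Rotate/flip a quadrant as the curve recurses."""
    if ry == 0:
        if rx == 1:
            x, y = n - 1 - x, n - 1 - y
        x, y = y, x
    return x, y

def d2xy(n, d):
    """Convert distance d along the Hilbert curve to (x, y) on an n x n grid."""
    x = y = 0
    s = 1
    while s < n:
        rx = 1 & (d // 2)
        ry = 1 & (d ^ rx)
        x, y = rot(s, x, y, rx, ry)
        x, y = x + s * rx, y + s * ry
        d //= 4
        s *= 2
    return x, y

pts = [d2xy(4, d) for d in range(16)]
# every grid cell is visited once, and consecutive cells are always adjacent
print(len(set(pts)), all(abs(ax - bx) + abs(ay - by) == 1
                         for (ax, ay), (bx, by) in zip(pts, pts[1:])))
```

A scheduler can then hand out contiguous curve intervals to get spatially compact node allocations.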
Show HN: Simulation-based high school physics course notes
I love the train demo for general & special relativity; it intuitively explains the theories from frames of reference (ground/train).
This helped me instantly understand the two theories. Great work!
WebAssembly brings extensibility to network proxies
FWIW, Ethereum WASM (ewasm) has a cost (in "particles" ("gas")) for each WebAssembly opcode. [1]
Opcode costs help to incentivize efficient code.
ewasm/design /README.md [2] links to the complete WebAssembly instruction set. [3]
[1] https://ewasm.readthedocs.io/en/mkdocs/determining_wasm_gas_...
[2] https://github.com/ewasm/design/blob/master/README.md
[3] https://webassembly.github.io/spec/core/appendix/index-instr...
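The idea can be sketched as a tiny interpreter-side meter (the opcode names and costs below are made up for illustration, not the actual ewasm schedule):

```python
# Charge a fixed cost per executed opcode; abort when the gas budget runs out.
COSTS = {"i32.add": 1, "i32.mul": 3, "i64.load": 3}

def meter(ops, gas_limit):
    gas = gas_limit
    for op in ops:
        gas -= COSTS[op]
        if gas < 0:
            raise RuntimeError("out of gas")
    return gas  # gas remaining

print(meter(["i32.add", "i32.mul", "i64.load"], 10))  # 3
```

Because every instruction has a price, tighter code is literally cheaper to run, which is the incentive mentioned above.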
Pandemic Ventilator Project
> https://www.projectopenair.org/
From https://app.jogl.io/project/121#about
>> Current Status of the project
>> The main bottleneck currently (2020-03-13) is organization / management.
>> […] This is an organization of experts and hobbyists from around the globe.
The link https://app.jogl.io/project/121#about doesn't go anywhere anymore. And there are no results for "ventilator" in the app.jogl.io search.
I see a "Ventilator Project" heading?
(edit) here's the link to their 'Ventilator' document: https://docs.google.com/document/d/1RDihfZIOEYs60kPEIVDe7gms...
Low-cost ventilator wins Sloan health care prize (2019)
Relating to coronavirus (COVID-19), if you require a ventilator, your survival chances are already slim.
> The median time from illness onset (ie, before admission) to discharge was 22·0 days (IQR 18·0–25·0), whereas the median time to death was 18·5 days (15·0–22·0; table 2). 32 patients required invasive mechanical ventilation, of whom 31 (97%) died. The median time from illness onset to invasive mechanical ventilation was 14·5 days (12·0–19·0). Extracorporeal membrane oxygenation was used in three patients, none of whom survived. Sepsis was the most frequently observed complication, followed by respiratory failure, ARDS, heart failure, and septic shock (table 2). Half of non-survivors experienced a secondary infection, and ventilator-associated pneumonia occurred in ten (31%) of 32 patients requiring invasive mechanical ventilation. The frequency of complications were higher in non-survivors than survivors (table 2)
https://www.thelancet.com/action/showPdf?pii=S0140-6736%2820...
Ventilator availability is limiting our ability to get care to the most people we can.
True, but at least Italy claims that the main bottleneck in ventilator availability isn't in the machines themselves but in trained doctors and nurses to support patients on ventilators.
AI can detect coronavirus from CT scans in twenty seconds
Is it possible to detect coronavirus with NIRS (Near-Infrared Spectroscopy)? https://en.wikipedia.org/wiki/Near-infrared_spectroscopy
FWIU, the equipment costs and scan times are lower with NIRS than with CT or MRI? And infrared is zero rads?
(Edit) I think it was this or the TED video that had the sweet demo: "The Science of Visible Thought & Our Translucent Selves | Mary Lou Jepsen | SU Global Summit" https://youtu.be/IRCXNBzfeC4
Are these devices in production?
AutoML-Zero: Evolving machine learning algorithms from scratch
Next:
- Autosuggest database tables to use
- Automatically reserve parallel computing resources
- Autodetect data health issues and auto fix them
- Autodetect concept drift and auto fix it
- Auto engineer features and interactions
- Autodetect leakage and fix it
- Autodetect unfairness and auto fix it
- Autocreate more weakly-labelled training data
- Autocreate descriptive statistics and model eval stats
- Autocreate monitoring
- Autocreate regulations reports
- Autocreate a data infra pipeline
- Autocreate a prediction serving endpoint
- Auto setup a meeting with relevant stakeholders on Google Calendar
- Auto deploy on Google Cloud
- Automatically buy carbon offset
- Auto fire your in-house data scientists
Would be funny but most of those things are already on AutoML Tables, including the carbon offset
> Would be funny but most of those things are already on AutoML Tables, including the carbon offset
GCP datacenters are 100% offset with PPAs. Are you referring to different functionality for costing AutoML instructions in terms of carbon?
...
I'd add:
- Setup a Jupyter Notebook environment
> Jupyter Notebooks are one of the most popular development tools for data scientists. They enable you to create interactive, shareable notebooks with code snippets and markdown for explanations. Without leaving Google Cloud's hosted notebook environment, AI Platform Notebooks, you can leverage the power of AutoML technology.
> There are several benefits of using AutoML technology from a notebook. Each step and setting can be codified so that it runs the same every time by everyone. Also, it's common, even with AutoML, to need to manipulate the source data before training the model with it. By using a notebook, you can use common tools like pandas and numpy to preprocess the data in the same workflow. Finally, you have the option of creating a model with another framework, and ensemble that together with the AutoML model, for potentially better results.
https://cloud.google.com/blog/products/ai-machine-learning/u...
This sounds like the sort of thing that would be useful outside of data science. Which leads to the question of whether it needs to be generalized, or redone differently for different specializations. Which in turn seems like the sort of question that it's tricky to answer with AI.
> This sounds like the sort of thing that would be useful outside of data science.
The instruction/operation costing or the computational essay/notebook environment setup?
Ethereum ("gas") and EOS have per-instruction costing. SingularityNET is a marketplace for AI solutions hosted on a blockchain, where you pay for AI/ML services with the SingularityNET AGI token. E.g. GridCoin and CureCoin compensate compute resource donations with their own tokens; which also have a floating exchange rate.
TLJH: "The Littlest JupyterHub" describes how to setup multi-user JupyterHub with e.g. Docker spawners that isolate workloads running with shared resources like GPUs and TPUs: http://tljh.jupyter.org/en/latest/
"Zero to BinderHub" describes how to setup BinderHub on a k8s cluster: https://binderhub.readthedocs.io/en/latest/zero-to-binderhub...
The notebook/procedure thing. Like, doesn't everybody everywhere operate on a basis of mixed manual/automated procedures, where it needs to fluidly transition from one to another, yet be controlled and recorded and verified and structured?
REES is one solution to reproducibility of the computational environment.
> BinderHub ( https://mybinder.org/ ) creates docker containers from {git repos, Zenodo, FigShare,} and launches them in free cloud instances also running JupyterLab by building containers with repo2docker (with REES (Reproducible Execution Environment Specification)). This means that all I have to do is add an environment.yml to my git repo in order to get Binder support so that people can just click on the badge in the README to launch JupyterLab with all of the dependencies installed.
> REES supports a number of dependency specifications: requirements.txt, Pipfile.lock, environment.yml, aptSources, postBuild. With an environment.yml, I can install the necessary CPython/PyPy version and everything else.
REES: https://repo2docker.readthedocs.io/en/latest/specification.h...
REES configuration files: https://repo2docker.readthedocs.io/en/latest/config_files.ht...
Storing a container built with repo2docker in a container registry is one way to increase the likelihood that it'll be possible to run the same analysis pipeline with the same data and get the same results years later.
...
Pachyderm ( https://pachyderm.io/platform/ ) does Data Versioning, Data Pipelines (with commands that each run in a container), and Data Lineage (~ "data provenance"). What other platforms are there for versioning data and recording data provenance?
...
Recording manual procedures is an area where we've somewhat departed from the "write in a lab notebook with a pen" practice. CoCalc records all (collaborative) inputs to the notebook with a timeslider for review.
In practice, people use notebooks for displaying generated charts, manual exploratory analyses (which does introduce bias), for demonstrating APIs, and for teaching.
Is JupyterLab an ideal IDE? Nope, not by a longshot. nbdev makes it easier to write a function in a notebook, sync it to a module, edit it with a more complete data-science IDE (like RStudio, VSCode, Spyder, etc), and then copy it back into the notebook. https://github.com/fastai/nbdev
> What other platforms are there for versioning data and recording data provenance?
Quilt also versions data and data pipelines: https://medium.com/pytorch/how-to-iterate-faster-in-machine-...
https://github.com/quiltdata/quilt (Python)
Poor data scientists, now whose heads get cut when things go wrong and companies lose billions?
In the days when Sussman was a novice, Minsky once came to him as he sat hacking at the PDP-6.
“What are you doing?”, asked Minsky.
“I am training a randomly wired neural net to play Tic-Tac-Toe” Sussman replied.
“Why is the net wired randomly?”, asked Minsky.
“I do not want it to have any preconceptions of how to play”, Sussman said.
Minsky then shut his eyes.
“Why do you close your eyes?”, Sussman asked his teacher.
“So that the room will be empty.”
At that moment, Sussman was enlightened.
Is this an argument in favor of unjustified magic constant arbitrary priors?
A sufficiently large amount of random data contains all the magic constants you could want.
AutoML-Zero aims to automatically discover computer programs that can solve machine learning tasks, starting from empty or random programs and using only basic math operations.
If this system is not using human bias, how is it choosing what a good program is? Surely humans labeling data involves humans adding their bias to the data?
It seems like AlphaGoZero was able to do just end-to-end ML because it was able to use a very clear and "objective" standard, whether a program wins or loses at the game of Go.
Would this approach only deal with similarly unambiguous problems?
Edit: also, AlphaGoZero was one of the most computationally expensive ML systems ever created (at least at the time of its creation). How much compute would this approach require for more fully general learning? Will there be a limit to such an approach?
Options for giving math talks and lectures online
One option: screencast development of a Jupyter notebook.
Jupyter Notebook supports LaTeX (MathTeX) and inline charts. You can create graded notebooks with nbgrader and/or with CoCalc (which records all (optionally multi-user) input such that you can replay it with a time slider).
Jupyter notebooks can be saved to HTML slides with reveal.js, but if you want to execute code cells within a slide, you'll need to install RISE: https://rise.readthedocs.io/en/stable/
Here are the docs for CoCalc Course Management; Handouts, Assignments, nbgrader: https://doc.cocalc.com/teaching-course-management.html
Here are the docs for nbgrader: https://nbgrader.readthedocs.io/en/stable/
You can also grade Jupyter notebooks in Open edX:
> Auto-grade a student assignment created as a Jupyter notebook, using the nbgrader Jupyter extension, and write the score in the Open edX gradebook
https://github.com/ibleducation/jupyter-edx-grader-xblock
Or just show the Jupyter notebook within an edX course: https://github.com/ibleducation/jupyter-edx-viewer-xblock
There are also ways to integrate Jupyter notebooks with various LMS / LRS systems (like Canvas, Blackboard, etc) "nbgrader and LMS / LRS; LTI, xAPI" on the "Teaching with Jupyter Notebooks" mailing list: https://groups.google.com/forum/#!topic/jupyter-education/_U...
"Teaching and Learning with Jupyter" ("An open book about Jupyter and its use in teaching and learning.") https://jupyter4edu.github.io/jupyter-edu-book/
> TLJH: "The Littlest JupyterHub" describes how to setup multi-user JupyterHub with e.g. Docker spawners that isolate workloads running with shared resources like GPUs and TPUs: http://tljh.jupyter.org/en/latest/
> "Zero to BinderHub" describes how to setup BinderHub on a k8s cluster: https://binderhub.readthedocs.io/en/latest/zero-to-binderhub...
If you create a git repository with REES-compatible dependency specification file(s), students can generate a container with all of the same software at home with repo2docker or with BinderHub.
> REES is one solution to reproducibility of the computational environment.
>> BinderHub ( https://mybinder.org/ ) creates docker containers from {git repos, Zenodo, FigShare,} and launches them in free cloud instances also running JupyterLab by building containers with repo2docker (with REES (Reproducible Execution Environment Specification)). This means that all I have to do is add an environment.yml to my git repo in order to get Binder support so that people can just click on the badge in the README to launch JupyterLab with all of the dependencies installed.
>> REES supports a number of dependency specifications: requirements.txt, Pipfile.lock, environment.yml, aptSources, postBuild. With an environment.yml, I can install the necessary CPython/PyPy version and everything else.
> REES: https://repo2docker.readthedocs.io/en/latest/specification.h...
> REES configuration files: https://repo2docker.readthedocs.io/en/latest/config_files.ht...
> Storing a container built with repo2docker in a container registry is one way to increase the likelihood that it'll be possible to run the same analysis pipeline with the same data and get the same results years later.
Aerogel from fruit biowaste produces ultracapacitors
> "Aerogel from fruit biowaste produces ultracapacitors with high energy density and stability" (2020) https://www.sciencedirect.com/science/article/pii/S2352152X1...
Years ago, I remember reading about supercapacitor electrodes made from what would be waste hemp bast fiber. They used graphene as a control. And IIRC, the natural branching structure in hemp (the strongest natural fiber) was considered ideal for an electrode.
"Hemp Carbon Makes Supercapacitors Superfast" https://www.asme.org/topics-resources/content/hemp-carbon-ma...
How do the costs and performance compare? Graphene, hemp, durian, jackfruit
While graphene production costs have fallen due to lots of recent research, IIUC all graphene production is hazardous due to graphene's ability to cross the lungs and the blood-brain barrier?
All my life, I've heard people go on and on about the magical properties of hemp.
It's hilarious.
They have these material science level arguments about hemp, yet the same people would be hard pressed to find an alternative use for cotton aside from clothes, or wood pulp aside from ikea furniture.
It's a dessert topping, it's a driveway sealant. And the government won't let us have it!
Hemp textiles are rough, but antimicrobial/antibacterial: hemp textiles resist the growth of pneumonia and staph bacteria.
AFAIU, when they blend hemp with e.g. rayon, it's good enough for underwear, sheets, and scrubs.
The government is getting the heck out of the way of hemp, a great rotation crop that can be used for soil remediation.
Damn if hemp isn't making a comeback... from CBD to ultracaps.
Hopefully but it’s only been legal to grow hemp in the US since 2018. https://www.vox.com/the-goods/2018/12/13/18139678/cbd-indust...
Technically, the 2013 farm bill (signed into law in 2014) authorized growing hemp for state-registered research purposes. https://www.votehemp.com/laws-and-legislation/federal-legisl...
Turns out UC Berkeley has an approach for brewing cannabinoids (and I think terpenes) from yeast, and a company in Germany has a provisional patent application to brew cannabinoids from bacteria. We could be sequestering carbon, and remediating coal ash and acid rain, with fields of industrial hemp, for which there are indeed thousands of uses.
Ask HN: How to Take Good Notes?
I want to improve my note-taking skill. I've started writing a text file with notes from class, however, I don't have a systematic way of writing. This means at this point I just wrote down, arbitrarily, things the professor said, things the professor wrote, how I understood the information, and everything else, mostly all over the place.
I'm wondering if anyone developed a system like this I could adapt to myself, and how did they do it.
I think you ought to consider whether you should take notes at all. Notetaking is great for remembering actions that you have committed to doing, or if you need to spread information to people who didn't participate in a meeting. Managers need to do a lot of notetaking.
However, taking notes seriously hinders your ability to engage with the material and build true understanding as you are listening, which would have helped you remember the material right away. If you are in school or are an individual contributor in a company, I think you ought to stop taking notes altogether.
If you need notes for future practice, I would advise you to write them after the meeting/lecture. Actively recalling things from memory is the best form of practice.
> In 2009, psychologist Jackie Andrade asked 40 people to monitor a 2-½ minute dull and rambling voice mail message. Half of the group doodled while they did this (they shaded in a shape), and the other half did not. They were not aware that their memories would be tested after the call. Surprisingly, when both groups were asked to recall details from the call, those that doodled were better at paying attention to the message and recalling the details. They recalled 29% more information! https://www.health.harvard.edu/blog/the-thinking-benefits-of...
https://en.wikipedia.org/wiki/Doodle#Effects_on_memory references the same study.
Related articles on GScholar: https://scholar.google.com/scholar?q=related:YVG_-PKhNH4J:sc...
> dull and rambling voice mail message
This is in no way a realistic study. A dull and rambling voice speaking about some random thing not related to you or your work will make people disengage. Doodling presumably keeps people from totally spacing out.
Many lectures and meetings may be experienced as similarly dull, irrelevant, and a waste of time (though you can't expect people to just read the necessary information ahead of time, as flipped classrooms expect of committed learners).
What would be a better experimental design for measuring effect on memory retention of passively-absorbed lectures?
Ask HN: STEM toy for a 3 years old?
Hello! Can the HN community recommend a STEM toy (or something similar that would educate and entertain) for my 3 yo boy? He's highly curious, but I can't find many things to play with him :( The things that I like bore him and the things that he likes bore me (or are way too messy and dangerous to let him do them)...
"12 Awesome (& Educational) STEM Subscription Boxes for Kids" https://stemeducationguide.com/subscription-boxes-for-kids/
Tape measure with big numbers, ruler(s)
Measuring cup, water, ice.
"Melissa & Doug Sunny Patch Dilly Dally Tootle Turtle Target Game (Active Play & Outdoor, Two Color Self-Sticking Bean Bags, Great Gift for Girls and Boys - Best for 3, 4, 5, and 6 Year Olds)"
Set of wooden blocks in a wood box; such as "Melissa & Doug Standard Unit Blocks"
...
https://sugarlabs.org/ , GCompris mouse and keyboard games with a trackpad and a mouse, ABCMouse, Khan Academy Kids, Code.org, ScratchJr (5-7), K12 Computer Science Framework https://k12cs.org/
OpenAPI v3.1 and JSON Schema 2019-09
In all honesty, aren't JSON Schema and OpenAPI more or less reinventing WSDL/SOAP/RPC? It feels like we've come full circle now and replaced <> with {} (and on the way lost a lot of mature XML tooling).
Defusedxml lists a number of XML "Attack vectors: billion laughs / exponential entity expansion, quadratic blowup entity expansion, external entity expansion (remote), external entity expansion (local file), DTD retrieval" https://pypi.org/project/defusedxml/
Are there similar vulnerabilities in JSON parsers (that would then also need to be monkeypatched)?
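JSON has no entities, so billion-laughs doesn't translate directly, but recursive-descent JSON parsers can still choke on deeply nested input. For example, CPython's json module hits its recursion limit (a safe failure mode, but still a denial-of-service vector if unhandled):

```python
import json

# 100k nested arrays: a few hundred KB of input that defeats naive recursion
evil = "[" * 100_000 + "]" * 100_000
try:
    json.loads(evil)
    blew_up = False
except RecursionError:
    blew_up = True
print(blew_up)  # True
```

Parsers that recurse without a depth check can instead crash the process outright, which is the closer analogue to the XML attacks.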
Sure, but the page itself says that it's mostly based on some uncommon features. Since we are converging on the feature list of XML/WSDL with JSON, what makes you think that some JSON parsing libraries (I'm using JSON as a placeholder for the next big thing in the JSON space) won't have bugs? Isn't that the danger of the added complexity?
Most likely some of those uncommon features won't be added to JSON; but then why didn't "we" as a community "just" invent an "XML light", or disable those dangerous but occasionally used features behind better defaults?
Over the years all I've seen is some mumbo jumbo about how XML is too complicated and too noisy and big (at least I used to say that), but what if that complexity came from decades of usage, from real requirements? Why wouldn't we arrive at the same situation 10 or 20 years from now? And if size matters so much, why do we ship megabytes of code for simple websites? And why didn't we choose protobufs over JSON?
And what makes you think that the next generation won't think that JSON is too complicated (because naturally, complexity will increase with new requirements and features)? So I suppose we'll see a JSON replacement with different brackets ("TOML Schema"?) in the future. Or maybe it's time for a binary representation again, just to be replaced with the next "editor-friendly and human-readable" format.
It all feels like we as an industry took a step backward for years (instead of improving the given tools and specs), just to head toward the same level of complexity, all the while throwing away lots and lots of mature XML processing tooling.
P.S.: The same probably goes for ASN.1 vs. XML Schemas... So now we are in the 3rd generation of schema-based data representation: ASN.1 -> XML -> JSON.
P.P.S.: I'm not arguing that we should all use XML now; I'm reflecting on the historical course and what might be ahead of us. Clearly, XML is declining steadily and has "lost".
Yeah, expressing complex types with only primitives in a portable way is still, unfortunately, a challenge. For example, how do we encode a datetime without ISO 8601 and a schema definition? Or complex numbers, or values with units, like "1j+1" or "1.01 USD/meter"?
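Without a schema, one ad hoc workaround is to tag values with an explicit type in the data itself, in the spirit of JSON-LD/XSD datatypes (the "@type" tags below are illustrative; there is no standard JSON encoding for complex numbers):

```python
import json
from datetime import datetime, timezone

class TypedEncoder(json.JSONEncoder):
    """Serialize types JSON has no literal for by tagging them explicitly."""
    def default(self, o):
        if isinstance(o, datetime):
            return {"@type": "xsd:dateTime", "@value": o.isoformat()}
        if isinstance(o, complex):
            return {"@type": "complex", "re": o.real, "im": o.imag}
        return super().default(o)

doc = {"when": datetime(2020, 3, 14, tzinfo=timezone.utc), "z": 1 + 1j}
encoded = json.dumps(doc, cls=TypedEncoder)
print(encoded)
```

The catch, of course, is that both ends must agree on the tagging convention, which is exactly the job a schema (or JSON-LD context) does.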
Fortunately, we can use XSD with RDFS and RDFS with JSON-LD.
LDP: Linked Data Platform and Solid: social linked data (and JSON-LD) are the newer W3C specs for HTTP APIs.
For one, pagination is a native feature of LDP.
JSON-LD is just the same old RDF data model that the W3C has promoted unsuccessfully for the last 2 decades, repackaged with trendy JSON syntax. It's almost as unreadable as the XML serialization and still puts very little value on the table. It shines, however, with people who like to do over-engineering instead of delivering a product.
It's astounding how often people make claims like this.
There is a whole lot of RDF Linked Data; and it links together without needing ad-hoc implementations of schema-specific relations.
I'll just link to the Linked Open Data Cloud again, for yet another hater that's probably never done anything for data interoperability: https://lod-cloud.net/
That's a success.
The only thing I miss is comments.
JSON5 supports comments: https://json5.org/
But who supports json5?
You can use json5 for comments and compile it to json.
You can use (yaml, hjson, jsonnet) for the same purpose. What sets json 5 apart?
YAML, hjson, and jsonnet do not validate existing JSON blobs. You could "use" ROT13 for the same purpose, but it's incidental to what you are trying to do.
I would plead that everyone stop using YAML. It's terrible at everything.
jsonnet is a template language for a serialization format. Who would choose that nightmare?
Ecmascript isn't big on extending the language orthogonally, so hjson is eventually going to be superseded by an ES*.
This is a niche concern that has an optimal path. Go with a validation schema designed for applying to a serialization format, which has widespread library support.
>YAML ... do not validate existing json blobs
YAML 1.2 is a strict superset of JSON. What do you mean by "validate" here?
>I would plead that everyone stop using YAML. It's terrible at everything.
Fair enough.
>jsonnet is a template language for a serialization format. Who would choose that nightmare?
People who are tired of Jinja2 and YAML but don't want to jump to a general purpose programming language?
>widespread library support
JSON5 is implemented in Javascript, Go, and C#. Not sure how "widespread" that is. Rust, C, Python, Lua, Java, and Haskell are missing out on the fun.
Git for Node.js and the browser using libgit2 compiled to WebAssembly
This looks useful. Are there pending standards for other browser storage mechanisms than an in-memory FS?
Would it be a security risk to grant limited local filesystem access by domain; with a storage quota?
... To answer my own question, it looks like the FileSystem API is still experimental and only browser extensions can request access to the actual filesystem: https://developer.mozilla.org/en-US/docs/Web/API/FileSystem
Actually, looks like emscripten has support for a few different options, including IndexedDB[1]. I have a use case where we'd like to both use the local filesystem via Node APIs as well as in-memory in the browser. I asked the author of wasm-git and it looks like this is possible with a custom build[2].
[1]: https://emscripten.org/docs/api_reference/Filesystem-API.htm...
Scientists use ML to find an antibiotic able to kill superbugs in mice
> That is an especially pressing challenge in the development of new antibiotics, because a lack of economic incentives has caused pharmaceutical companies to pull back from the search for badly needed treatments. Each year in the U.S., drug-resistant bacteria and fungi cause more than 2.8 million infections and 35,000 deaths, with more than a third of fatalities attributable to C. diff, according to the Centers for Disease Control and Prevention.
How big does the market have to be to be commercially viable for research and development? Nearly 3M potential patients at a couple hundred dollars per course nears $1B/year (e.g. 2.8M × $350 ≈ $0.98B).
The second-order costs avoided by such treatments could also be included in a "value to society" estimation.
"Acknowledgements" lists the grant funders for this federally-funded open access study.
"A Deep Learning Approach to Antibiotic Discovery" (2020) https://doi.org/10.1016/j.cell.2020.01.021
> Mutant generation
> Chemprop code is available at: https://github.com/swansonk14/chemprop
> Message Passing Neural Networks for Molecule Property Prediction
> A web-based version of the antibiotic prediction model described herein is available at: http://chemprop.csail.mit.edu/
> This website can be used to predict molecular properties using a Message Passing Neural Network (MPNN). In order to make predictions, an MPNN first needs to be trained on a dataset containing molecules along with known property values for each molecule. Once the MPNN is trained, it can be used to predict those same properties on any new molecules.
Shit – An implementation of Git using POSIX shell
Hiya HN. I was ranting on Mastodon earlier today because I feel like people learn git the wrong way - from the outside in, instead of the inside out. I reasoned that git internals are pretty simple and easy to understand, and that the supposedly obtuse interface makes a lot more sense when you approach it with an understanding of the fundamentals in hand. I said that the internals were so simple that you could implement a workable version of git using only shell scripts inside of an afternoon. So I wrapped up what I was working on and set out to prove it.
Five hours later, it had turned into less of a simple explanation of "look how simple these primitives are, we can create them with only a dozen lines of shell scripting!" and more into "oh fuck, I didn't realize that the git index is a binary file format". Then it became a personal challenge to try and make it work anyway, despite POSIX shell scripts clearly being totally unsuitable for manipulating that kind of data.
Anyway, this is awful, don't use it for anything, don't read the code, don't look at it, just don't.
> the supposedly obtuse interface makes a lot more sense when you approach it with an understanding of the fundamentals in hand.
Agreed. I always said the best git tutorial is https://www.sbf5.com/~cduan/technical/git/
> The conclusion I draw from this is that you can only really use Git if you understand how Git works. Merely memorizing which commands you should run at what times will work in the short run, but it’s only a matter of time before you get stuck or, worse, break something.
I never really understood Git until I read this tutorial: https://github.com/susam/gitpr
Things began to click for me as soon as I read this in its intro section:
> Beginners to this workflow should always remember that a Git branch is not a container of commits, but rather a lightweight moving pointer that points to a commit in the commit history.
A---B---C
↑
(master)
> When a new commit is made in a branch, its branch pointer simply moves to point to the last commit in the branch.
A---B---C---D
↑
(master)
> A branch is merely a pointer to the tip of a series of commits. With this little thing in mind, seemingly complex operations like rebase and fast-forward merges become easy to understand and use.
This "moving pointer" model of Git branches led me to instant enlightenment. Now I can apply this model to other complicated operations too, like conflict resolution during rebase, interactive rebase, force pushes, etc.
If I had to select a single most important concept in Git, I would say it is this: "A branch is merely a pointer to the tip of a series of commits."
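You can verify the "branch is just a pointer" claim directly: `.git/refs/heads/<branch>` is a tiny file containing nothing but a commit hash. A sketch (assuming `git` is on PATH; the file names are made up):

```python
import os
import subprocess
import tempfile

def run(*args, cwd):
    """Run a git command and return its trimmed stdout."""
    return subprocess.run(args, cwd=cwd, check=True,
                          capture_output=True, text=True).stdout.strip()

with tempfile.TemporaryDirectory() as repo:
    run("git", "init", "-q", cwd=repo)
    run("git", "config", "user.email", "a@example.com", cwd=repo)
    run("git", "config", "user.name", "A", cwd=repo)
    with open(os.path.join(repo, "f.txt"), "w") as f:
        f.write("A\n")
    run("git", "add", "f.txt", cwd=repo)
    run("git", "commit", "-q", "-m", "A", cwd=repo)

    branch = run("git", "branch", "--show-current", cwd=repo)  # master or main
    head = run("git", "rev-parse", "HEAD", cwd=repo)
    # The branch "pointer" is literally a one-line file holding HEAD's hash
    with open(os.path.join(repo, ".git", "refs", "heads", branch)) as f:
        ref = f.read().strip()
    assert ref == head
```

Commit again and that one file changes to the new hash; that is all a branch is.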
And you can see this structure if you add to any "git log" command "--graph --oneline --decorate --color". IIRC some of those are unnecessary in recent versions of git, I just remember needing all of them at the point I started using it regularly.
I have a bash function for it (with a ton of other customizations, but it boils down to this):
function pwlog() {
git log "$@" --graph --oneline --decorate --color | less -SEXIER
}
pwlog --all -20
(...in that "less" command, "S" truncates instead of wraps lines, one "E" exits at EOF, "X" prevents screen-clearing, and "R" keeps the color output. The second "E" does nothing special; it and "I" (case-insensitive search) are just there to complete the word.)
You can also set $GIT_PAGER/core.pager/$PAGER and create an alias to accomplish this:
#export PAGER='less -SEXIER'
#export GIT_PAGER='less -SEXIER'
git config --global core.pager 'less -SEXIER'
git config --global alias.l 'log --graph --oneline --decorate --color'
# git diff ~/.gitconfig
git l
core.pager: https://git-scm.com/docs/git-config#Documentation/git-config...
> The order of preference is the $GIT_PAGER environment variable, then core.pager configuration, then $PAGER, and then the default chosen at compile time (usually less).
HTTP 402: Payment Required
The RFC offers no reasoning for this besides
> The 402 (Payment Required) status code is reserved for future use.
Isn't this obviously domain-specific and something that should not be part of the transfer protocol?
> The new W3C Payment Request API [4] makes it easy for browsers to offer a standard (and probably(?) already accessible) interface for the payment data entry screen, at least. https://www.w3.org/TR/payment-request/
Salesforce Sustainability Cloud Becomes Generally Available
> - Reduce emissions with trusted analytics from a trusted platform. Analyzing carbon emissions from energy usage and company travel can be daunting and time-consuming. But with all your data flowing directly onto one platform, you can efficiently quantify your carbon footprint. Formulate a climate action plan for your company from a single source of truth, built on our trusted and secure data platform.
> - Take action with data-driven insights. Prove to customers, employees, and potential investors your commitment to carbon-conscious and sustainable practices. Offer regulatory agencies a clear snapshot of your energy usage patterns. Extrapolate energy consumption and track carbon emissions with cutting-edge analytics — and take action.
> - Tackle carbon accounting audits in weeks instead of months. Carbon analysis can be an overwhelming time commitment, even a barrier to action for companies that want to get it right. Use preloaded datasets from the U.S. EPA, IPCC, and others to accurately assess your carbon accounting. Streamline your data gathering and climate action plan with embedded guides and user flows.
> - Empower decision makers with executive-ready dashboard data. Evaluate corporate environmental impact with rich data visualization and dashboards. Track energy patterns and emission trends, then make the business case to executives. Once an organization understands its carbon footprint, decision makers can begin to drive sustainability solutions.
Are there similar services for Sustainability Reporting and accountability? https://en.wikipedia.org/wiki/Sustainability_reporting
Httpx: A next-generation HTTP client for Python
> Fully type annotated.
This is a huge win compared to requests. AFAICT requests is too flexible (read: easier to misuse) and difficult to retrofit with type annotations now. The author of requests gave a horrible type annotation example here [0].
IMO at this time when you evaluate a new Python library before adopting, "having type annotation" should be as important as "having decent unit test coverage".
The huge win over requests is that HTTPX is fully async: you can download 20 files in parallel without too much effort. Throughput with Python asyncio is amazing, comparable to Node or perhaps better. That's the main point.
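The concurrency pattern behind that claim can be sketched with the stdlib alone. This simulates the 20 downloads with `asyncio.sleep`; with HTTPX you would swap the body of `fetch` for an `httpx.AsyncClient` request (the URLs here are made up):

```python
import asyncio
import time

async def fetch(url: str) -> str:
    # Stand-in for: async with httpx.AsyncClient() as c: await c.get(url)
    await asyncio.sleep(0.1)  # simulate network latency
    return url

async def fetch_all(urls):
    # All requests are in flight at once, so total time is roughly
    # one request's latency, not the sum of all of them
    return await asyncio.gather(*(fetch(u) for u in urls))

urls = [f"https://example.com/file{i}" for i in range(20)]
start = time.monotonic()
results = asyncio.run(fetch_all(urls))
elapsed = time.monotonic() - start
# 20 "downloads" complete in ~0.1s instead of ~2s
```

The event loop does the interleaving; no threads or processes are involved.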
FWIW, requests3 has "Type-annotations for all public-facing APIs", asyncio, HTTP/2, connection pooling, timeouts, etc https://github.com/kennethreitz/requests3
Sorry, have you checked the source? Are these features there or only announced? Has requests added a timeout by default finally?
It looks like requests is now owned by PSF. https://github.com/psf/requests
But IDK why requests3 wasn't transferred as well, and why issues appear to be disabled on the repo now.
The docs reference a timeout arg (that appears to default to the socket default timeout) for connect and/or read https://3.python-requests.org/user/advanced/#timeouts
And the tests reference a timeout argument. If that doesn't work, I wonder how much work it would be to send a PR (instead of just talking trash to Ken and not contributing any code)
>But IDK why requests3 wasn't transferred as well, and
That's the thing... Who knows.
TIL requests3 beta works with httpx as a backend: https://github.com/not-kennethreitz/team/issues/21#issuecomm...
If requests3 is installed, `import requests` imports requests3
BlackRock CEO: Climate Crisis Will Reshape Finance
+1. From the letter: https://www.blackrock.com/us/individual/larry-fink-ceo-lette...
> The money we manage is not our own. It belongs to people in dozens of countries trying to finance long-term goals like retirement. And we have a deep responsibility to these institutions and individuals – who are shareholders in your company and thousands of others – to promote long-term value.
> Climate change has become a defining factor in companies’ long-term prospects. Last September, when millions of people took to the streets to demand action on climate change, many of them emphasized the significant and lasting impact that it will have on economic growth and prosperity – a risk that markets to date have been slower to reflect. But awareness is rapidly changing, and I believe we are on the edge of a fundamental reshaping of finance.
> The evidence on climate risk is compelling investors to reassess core assumptions about modern finance. Research from a wide range of organizations – including the UN’s Intergovernmental Panel on Climate Change, the BlackRock Investment Institute, and many others, including new studies from McKinsey on the socioeconomic implications of physical climate risk – is deepening our understanding of how climate risk will impact both our physical world and the global system that finances economic growth.
Environmental, social and corporate governance > Responsible investment: https://en.wikipedia.org/wiki/Environmental,_social_and_corp...
Corporate social responsibility: https://en.wikipedia.org/wiki/Corporate_social_responsibilit...
UN-supported PRI: Principles for Responsible Investment (2,350 signatories (2019-04)) https://en.wikipedia.org/wiki/Principles_for_Responsible_Inv...
A lot of complex “scalable” systems can be done with a simple, single C++ server
Many developers severely underestimate how much workload can be served by a single modern server and high-quality C++ systems code. I've scaled distributed workloads 10x by moving them to a single server and a different software architecture more suited for scale-up, dramatically reducing system complexity as a bonus. The number of compute workloads I see that actually need scale-out is vanishingly small even in industries known for their data intensity. You can often serve millions of requests per second from a single server even when most of your data model resides on disk.
We've become so accustomed to extremely inefficient software systems that we've lost all perspective on what is possible.
Can you expand on this? I have some pretty massive compute loads that need to be scaled onto a cluster with 100+ workers for most computations. This is after I use a library called dask, which builds a task graph and does its own MapReduce-style optimisation inside its modules. This is all for a relatively small 250GB raw data file that I keep in a CSV (and need to convert to SQL at some point).
Are you saying this can be optimised to fit inside a single 10 core server in terms of compute loads?
Don't know why you're being downvoted but I'll assume your question is genuine.
You use a cluster when your data and compute requirements are large and parallel enough that the gains outweigh the tax paid on network latency, plus giving up the 10-20X speedup of local SSD and the 1000X speedup of just keeping data in RAM.
250 gigs is tiny enough that you could probably get much better performance running on a high-memory instance in AWS or GCP. You'll generally have to write your own multiprocessing code, though, which is fairly simple; your existing library may also be able to support it.
I once actually ran this kind of workload on just my laptop using a compiled language that performed better than pyspark on a cluster.
I'd love to keep it in RAM if I could. The problem is, the library I'm familiar with (pandas) typically seems to take more memory than the original CSV file once it loads it into memory. I know this is due to bad data types, but in certain cases I cannot get around those.
However, even if I could load it all into memory at once, and assuming it takes 200 gb, I'm still using a master's student access to a cluster. So I get preempted like it's nobody's business. Hence why I prefer a smaller memory footprint even if I take up cpus at variable rates through a single execution.
I did try to write my own multiprocessing code for this, but the operations are sometimes too complicated (like groupby) for me to rewrite everything from the ground up. If I'm not reliant on serial data communication between processes (like you'd need to sort a column), I can get it done pretty easily. In fact, I wrote my data cleaning code with this and cleaned up the entire file in half an hour because single chunks didn't rely on others.
However, if you have some idea of how to run these computational loads in parallel in python or any other language on single compute instances (like the size of a laptop's memory of 16 gb), I'd really love to see it. Thanks.
Numpy supports memory mapping `ndarrays` which can back a DataFrame in pandas. This lets you access a dataset far larger than will fit in RAM as if it lived in RAM. Provided it's on fast SSD storage you'll have speedy access to the data and can process huge chunks at once.
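A minimal sketch of `np.memmap` (a real numpy API; the file path and shape here are made up). Note that whether a pandas DataFrame built on top of such an array avoids copying depends on the pandas version and how you construct it, so the "backing a DataFrame" part deserves testing on your setup:

```python
import os
import tempfile
import numpy as np

path = os.path.join(tempfile.mkdtemp(), "big.dat")

# Create a disk-backed array: writes go to the file, and only the
# pages you actually touch occupy RAM
arr = np.memmap(path, dtype="float64", mode="w+", shape=(1_000_000,))
arr[:] = 1.0
arr.flush()
del arr  # close the underlying mmap

# Reopen read-only: the OS pages slices in on demand, so the dataset
# can be far larger than physical RAM
arr = np.memmap(path, dtype="float64", mode="r", shape=(1_000_000,))
total = float(arr[:500_000].sum() + arr[500_000:].sum())  # 1,000,000.0
```

Processing in large slices like this keeps the working set bounded while the file itself can be hundreds of gigabytes.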
Can you provide a link to this please? My current knowledge is that all numpy data lives in memory, and pandas itself has a feature to fragment any data into iterables so I can read up to my memory limit. I cannot use this feature due to the serial nature of some of the operations that I alluded to (I'd have to almost rewrite the entire library for some of these complicated operations, like groupby and sorting).
I do have fast SSD storage because it's on the scratch drive of a cluster and from what I've seen it can do ~300-400 MB/s easily. I haven't had a chance to test more than that since I'm mostly memory constrained in much of my testing.
My current attempt is to push this data into a pure database handling system like SQL so that I can query it. But like I said, I work with a less-than-stellar set of tools and I have to literally set up a postgres server from ground up to write to it. Which shouldn't be a big deal except when it's on a non-root user and I have to keep remapping dependencies (took 5-6 hours to set it up on the instance I have access to).
My other option was to write the entire 250 GB to a SQLite database using the SQLAlchemy library in Python, but that seems to fail whether I do it with parallel writes or serial writes. In both cases, it fails after I create ~64-70 tables.
Dask groupby example: https://examples.dask.org/dataframes/02-groupby.html
> Generally speaking, Dask.dataframe groupby-aggregations are roughly same performance as Pandas groupby-aggregations, just more scalable.
The dask.distributed scheduler can also run on one high-RAM instance (with threads or processes) https://docs.dask.org/en/latest/setup.html
Pandas docs > Ecosystem > Out-of-core: https://pandas.pydata.org/pandas-docs/stable/ecosystem.html#...
Reading from Parquet into Apache Arrow is much faster than CSV because the data can just be directly loaded into RAM. https://ursalabs.org/blog/2019-10-columnar-perf/
If you have GPU instances, cuDF has a Pandas-like API on top of Apache Arrow. https://github.com/rapidsai/cudf
> Built based on the Apache Arrow columnar memory format, cuDF is a GPU DataFrame library for loading, joining, aggregating, filtering, and otherwise manipulating data.
> cuDF provides a pandas-like API that will be familiar to data engineers & data scientists, so they can use it to easily accelerate their workflows without going into the details of CUDA programming.
Dask-ML makes scalable scikit-learn, XGBoost, TensorFlow really easy. https://dask-ml.readthedocs.io/en/latest/
... re: the OT: While it's possible to write C++ code that's really fast, it's generally inflexible, expensive to develop, and risky even for devs experienced in their respective domains to write. Much saner to put a Python API on top and optimize the compiled layer underneath.
There are a few C++ frameworks in the top quartile of the TechEmpower framework benchmarks. https://www.techempower.com/benchmarks/
Hardware/hosting is relatively cheap. Developers and memory vulnerabilities aren't.
Unfortunately, I'm not sure what's wrong with dask, but it doesn't work properly on my cluster. I tested it on an exceedingly simple operation: find all unique values in a very big column (5 billion rows, but I know for a fact that there are only 500-502 unique values in there). With 100 workers, it still failed. Now this is an embarrassingly parallel operation that can be implemented trivially. So I'm not sure if there's a problem with my cluster or if dask just does not work with Slurm clusters very well.
https://docs.dask.org/en/latest/setup/hpc.html says dask-jobqueue handles "PBS, SLURM, LSF, SGE and other resource managers"
"Dask on HPC, what works and what doesn't" https://github.com/dask/dask-blog/issues/5
Maybe you should spend some time developing a job visualization system from scratch, for end users with lots of C, JS, and HTML experience https://jobqueue.dask.org/en/latest/interactive.html
Yes, I think the library still isn't ready for non-HPC experts to use without tinkering. I don't have that level of expertise. I'm a data person who can handle working tools.
Warren Buffett is spending billions to make Iowa 'the Saudi Arabia of wind'
It's both cost-rational and environment-rational to invest heavily in clean energy (with or without the comparatively paltry tax incentives).
The long-term costs of climate change and inaction are unfortunately still mostly external costs to energy producers. We should expect that to change as we start developing competencies in evaluating the costs and frequency of weather disasters exacerbated by anthropogenic climate change. We all get to pay for floods, fires, tornados, hurricanes, landslides, blizzards, and the gosh darn heat.
Insurance firms clearly see these costs. Our military sees the costs of responding to natural disasters. Local economies see the costs of months and years spent on disaster relief; on just getting back up to speed so that they can generate profit from selling goods and services (and pay taxes to support disaster relief efforts essential to operational readiness).
The cost per kilowatt hour of wind (and solar) energy is now lower than operating existing dirty energy plants that dump soot on our crops, air, and water.
With wind, they talk about the "alligator curve". With solar, it's the "duck curve". Grid-scale energy storage is necessary for reaching 100% renewable energy as soon as possible.
Iowa's renewable energy tax incentives are logically aligned with international long-term goals:
UN Sustainable Development Goal 7: Affordable and Clean Energy https://www.globalgoals.org/7-affordable-and-clean-energy
Goal 13: Climate Action https://www.globalgoals.org/13-climate-action
SDG Target 12.6: "Encourage companies to adopt sustainable practices and sustainability reporting" (CSR; e.g. GRI Sustainability Reporting Standards that we can score portfolios with)
https://www.undp.org/content/undp/en/home/sustainable-develo... :
> Rationalize inefficient fossil-fuel subsidies that encourage wasteful consumption by removing market distortions, in accordance with national circumstances, including by restructuring taxation and phasing out those harmful subsidies, where they exist, to reflect their environmental impacts, taking fully into account the specific needs and conditions of developing countries and minimizing the possible adverse impacts on their development in a manner that protects the poor and the affected communities
...
> Thanks. How can I say "try and only run this [computational workload] in zones with 100% PPA offsets or 100% directly sourced #CleanEnergy"? #Goal7 #Goal11 #Goal12 #Goal13 #GlobalGoals #SDGs
It makes good business sense to invest in clean energy to take advantage of tax incentives, minimize future costs to other business units (e.g. insurance, taxes), and earn the support of investors choosing portfolios with long term environmental (and thus economic) sustainability as a primary objective.
Scientists Likely Found Way to Grow New Teeth for Patients
"Scientists Have Discovered a Drug That Fixes Cavities and Regrows Teeth" https://futurism.com/neoscope/scientists-have-discovered-thi...
Tideglusib https://en.wikipedia.org/wiki/Tideglusib
> A tooth repair mechanism that promotes dentine reinforcement of a sponge structure until the sponge biodegrades, leaving a solid dentine structure. In 2016, the results of animal studies were reported in which 0.14 mm holes in mouse teeth were permanently filled.
Very interesting mechanism.
GSK3 inhibitors are interesting, but we don't understand much about the subtypes that are likely to exist: "GSK-3 appears to both promote and inhibit apoptosis, and this regulation varies depending on the specific molecular and cellular context."
Still, I believe better GSK3i will find a role in autologous bone-marrow grafting therapies to fight senescence: with TERC/TERT overexpression (the opposite of de Grey's WILT idea) to send new stem cells to the tissues, just like how you can find the donor's chromosomes in most tissues of a grafted patient.
Announcing the New PubMed
I'm actually part of the team developing the new PubMed. Very curious and interested to know what the hacker news community thinks and feels about their experience. https://pubmed.gov/labs
This looks great: I like the search timeline, the ability to easily search for free full-text meta-analyses (a selection bias we should all be aware of), the MeSH term listing in a reasonably-sized font, and that there's schema.org/ImageObject metadata within the page, but there's no [Medical]ScholarlyArticle metadata?
I've worked with Google Scholar (:o) [1], Semantic Scholar (Allen Institute for AI) [2], Meta (Chan Zuckerberg Institute) [3], Zotero, Mendeley and a number of other tools for indexing and extracting metadata and graph relations from https://schema.org/ScholarlyArticle and MedicalScholarlyArticle. Without RDFa (or Microdata, or JSON-LD) in PDF, there's a lot of parsing that has to go down in order to get a graph from the citations in the article. Each service adds value to this graph of resources. Pushing forward on publishing linked research that's reproducible (#LinkedResearch, #LinkedReproducibility) is a worthwhile investment in meta-research that we have barely yet addressed:
> http://Schema.org/NewsArticle .citation: https://schema.org/citation ... Wouldn't it be great if NewsArticles linked to the ScholarlyArticle and/or Notebook CreativeWorks that they're .about (with reified relations)?
> A practical use case: Alice wants to publish a ScholarlyArticle [1] (in HTML with structured data, as a PDF) predicated upon Datasets [2] (as CSV, CSVW JSONLD, XLSX (DataDownload)) with static HTML (and no special HTTP headers). 1 https://schema.org/ScholarlyArticle 2 https://schema.org/Dataset*
> B wants to build a meta analysis: to collect a # of ScholarlyArticles and Dataset DataDownloads; review study controls and data; merge, join, & concatenate Datasets if appropriate, and inductively or deductively infer a conclusion and suggestions for further studies of variance*
The Linked Open Data Cloud shows the edges, the relations, the structured data links between very many (life sciences) datasets: https://lod-cloud.net/ . https://5stardata.info/en/ lists TimBL's suggested 5-star deployment scheme for Open Data, which culminates in publishing linked open data in non-proprietary formats that use URIs to describe and link to things.
Could any of these [1][2][3][4][5] services cross-link the described resources, given a common URI identifier such as https://schema.org/identifier and/or https://schema.org/url ? ORCID is a service for generating stable identifiers for researchers and publishers who have names in common but different emails. W3C DID solves for this need in a different way.
When I check an article result page with the OpenLink OSDS extension (or any of a number of other tools for extracting structured data from HTML pages (and documents!) https://github.com/CodeForAntarctica/codeforantarctica.githu... ), there could be quite a bit more data there for search engines, browser extensions, and meta-research tools.
Is this something like ElasticSearch on the backend? It is possible to store JSON-LD documents in the search index. I threw together elasticsearchjsonld to "Generate JSON-LD @contexts from ElasticSearch JSON Mappings" for the OpenFDA FAERS data a few years ago. That's not GraphQL or SPARQL, but it's something and it's Linked Data.
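The idea can be sketched in a few lines (this is a hypothetical function, not the real elasticsearchjsonld API): walk an ElasticSearch mapping's properties and emit a JSON-LD `@context` term for each field, with schema.org as the assumed default vocabulary:

```python
import json

def mapping_to_context(es_mapping, vocab="https://schema.org/"):
    # Hypothetical sketch: derive a JSON-LD @context from an
    # ElasticSearch mapping's property names
    props = es_mapping.get("properties", {})
    context = {"@vocab": vocab}
    for field, spec in props.items():
        term = {"@id": vocab + field}
        if spec.get("type") == "date":
            term["@type"] = "http://www.w3.org/2001/XMLSchema#dateTime"
        context[field] = term
    return {"@context": context}

mapping = {"properties": {"name": {"type": "text"},
                          "datePublished": {"type": "date"}}}
doc = mapping_to_context(mapping)
print(json.dumps(doc, indent=2))
```

Merging the resulting `@context` into each indexed document makes every stored record valid Linked Data without changing the search index itself.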
re: "Canada's Decision To Make Public More Clinical Trial Data Puts Pressure On FDA" https://news.ycombinator.com/item?id=21232183
> We really could get more out of this data through international collaboration and through linked data (e.g. URIs for columns). See: "Open, and Linked, FDA data" https://github.com/FDA/openfda/issues/5#issuecomment-5392966... and "ENH: Adverse Event Count / 'Use' Count Heatmap" https://github.com/FDA/openfda/issues/49 . With sales/usage counts, we'd have a denominator with which we could calculate relative hazard.
W3C Web Annotations handle threaded comments and highlights; reviewing the reviewers is left as an exercise for the reader. Does Zotero still make it easy to save the bibliographic metadata for one or more ScholarlyArticles from PubMed to a collection in the cloud (and add metadata/annotations)?
Sorry to toot my own horn here. Great job on this. This opens up many new opportunities for research.
[1] https://scholar.google.com
[2] https://www.semanticscholar.org/
PubMed publishes its dataset for download. It's rather large, but update files come frequently. It's amazing. I believe NIH adds the MeSH terms.
ftp://ftp.ncbi.nlm.nih.gov/pubmed/
We had someone do a project with it: they downloaded the dataset and used it to create a tool to do some searches that we found useful for finding collaborators (last author, working on a specific gene, paper counts, most recent).
Searching by MeSH terms across species, and searching with orthologs.
The dataset sometimes has a hard time disambiguating names (I think the European dataset assigns IDs to names)
To make sure your feedback is heard, please use the "Feedback" button found in the bottom right corner of https://pubmed.gov/labs
Ask HN: Is it worth it to learn C in 2020?
The Linux kernel, the FreeBSD kernel, the Windows and macOS kernels, Python, Ruby, Perl, PHP, Node.js (in part C++), and NumPy are all largely written in C. If you want to review and contribute code, you'd need to learn C.
There are a number of coding guidelines e.g. for safety-critical systems where bounded running time and resource consumption are essential. These coding guidelines and standards are basically only available for C, C++, and Ada. https://github.com/stanislaw/awesome-safety-critical/blob/ma...
Even though modern languages have garbage collection that runs whenever it feels like it, it's helpful to learn about memory management in C (or C++). You'll appreciate object destructor methods that free memory, sockets, and file handles that much more. Reference cycles in object graphs are easier to handle with modern C++ than with C. Are there RAII ("Resource Acquisition Is Initialization") "smart pointers" that track reference counts in C?
Without OO namespacing, in C, function names are often prefixed with namespaces. How many ways could a struct be initialized? When can I free that memory?
When strace prints a syscall, what is that?
Is it necessary to learn C? Somebody needs to maintain and improve the C-based foundation for most of our OSes and very many of our fancy scripting languages. C can be very unforgiving: it's really easy to do it wrong, and there's a lot to keep in mind at once. The cognitive burden is higher with C (and higher still with ASM and WebAssembly) than with an interpreted (or compiled) duck-typed 3GL scripting language with first-class functions.
What's a good progression that includes syntax, finding and reading the libc docs, Make/CMake/Autotools, secure recommended compiler flags for GCC (CPPFLAGS, CFLAGS, LDFLAGS) and LLVM Clang?
C: https://learnxinyminutes.com/docs/c/
C++: https://learnxinyminutes.com/docs/c++/
Links to the docs for Libc and other tools: https://westurner.github.io/tools/#libc
xeus-cling is a Jupyter kernel for C++ (and most of C) that works with nbgrader. https://github.com/QuantStack/xeus-cling
What's a better unit-testing library for C/C++ than gtest? https://github.com/google/googletest/
Amazing comment, thank you for your detailed input and the awesome resources. I want to get into C because I like network and system programming but lack the underlying knowledge and experience. These links will help me start on that new path. Very much appreciated sir :)
For network programming, you might consider asynchronous programming with coroutines. C++20 has them and they're already supported in LLVM. For C, there are a number of implementations of coroutines: https://en.wikipedia.org/wiki/Coroutine#Implementations_for_...
> Once a second call stack has been obtained with one of the methods listed above, the setjmp and longjmp functions in the standard C library can then be used to implement the switches between coroutines. These functions save and restore, respectively, the stack pointer, program counter, callee-saved registers, and any other internal state as required by the ABI, such that returning to a coroutine after having yielded restores all the state that would be restored upon returning from a function call. Minimalist implementations, which do not piggyback off the setjmp and longjmp functions, may achieve the same result via a small block of inline assembly which swaps merely the stack pointer and program counter, and clobbers all other registers. This can be significantly faster, as setjmp and longjmp must conservatively store all registers which may be in use according to the ABI, whereas the clobber method allows the compiler to store (by spilling to the stack) only what it knows is actually in use.
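The save-state-on-yield, restore-state-on-resume idea that setjmp/longjmp implements in C can be illustrated with Python generators, which are coroutines in miniature (a toy cooperative scheduler, not any real library's API):

```python
def coroutine(label, n):
    # Each `yield` saves this frame's state (locals, instruction pointer);
    # the next next() call restores it, exactly the coroutine contract
    for i in range(n):
        yield f"{label}{i}"

def round_robin(*gens):
    # A tiny cooperative scheduler: switch between "coroutines" in turn
    results, live = [], list(gens)
    while live:
        for g in list(live):
            try:
                results.append(next(g))
            except StopIteration:
                live.remove(g)
    return results

out = round_robin(coroutine("a", 2), coroutine("b", 2))
# out == ["a0", "b0", "a1", "b1"]
```

In C the scheduler must swap stack pointers and registers by hand; here the interpreter does the frame bookkeeping for you, but the control flow is the same.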
CPython's asyncio implementation (originally codenamed 'tulip') includes a C accelerator module and is IMHO much easier to use than callback styles like Twisted, or JS before Promises and the inclusion of tulip-like async/await keywords in ECMAScript. Uvloop, based on libuv like Node, is apparently the fastest asyncio event loop. CPython asyncio C module source: https://github.com/python/cpython/blob/master/Modules/_async... Asyncio docs: https://docs.python.org/3/library/asyncio.html
(When things like file or network I/O are I/O bound, the program can yield to allow other asynchronous coroutines ('async') to run on that core. With network programming, we're typically waiting for things to send or reply.)
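The yield-on-I/O idea above can be sketched with stdlib asyncio; `asyncio.sleep()` stands in for a network round trip, and the names `fetch`/`main` are illustrative, not from any particular library:

```python
import asyncio

async def fetch(name, delay):
    # awaiting I/O (simulated here with sleep) yields control to the
    # event loop, letting other coroutines run on the same thread
    await asyncio.sleep(delay)
    return f"{name} done"

async def main():
    # both "requests" wait concurrently, so total time is roughly
    # max(delays), not their sum
    return await asyncio.gather(fetch("a", 0.1), fetch("b", 0.1))

print(asyncio.run(main()))
```

With real sockets the same structure applies: the coroutine suspends while the kernel waits for data, and the loop schedules whatever else is runnable.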
Return-oriented-programming > Return-into-library technique is an interesting read regarding system programming :) https://en.wikipedia.org/wiki/Return-oriented_programming#Re...
Free and Open-Source Mathematics Textbooks
This is a good list of books. Unfortunately, many of the links seem to be broken. Probably just my luck, but the first few "with Sage" books I excitedly selected 404'd. I'll send an email.
> Moreover, the American Institute of Mathematics maintains a list of approved open-source textbooks. https://aimath.org/textbooks/approved-textbooks/
I also like the (free) Green Tea Press books: Think Stats, Think Bayes, Think DSP, Think Complexity, Modeling and Simulation in Python, Think Python 2e: How To Think Like a Computer Scientist https://greenteapress.com/wp/
And IDK how many times I've recommended the book for the OCW "Mathematics for Computer Science" course: https://ocw.mit.edu/courses/electrical-engineering-and-compu...
There may be a newer edition than the 2017 version of the book: https://courses.csail.mit.edu/6.042/spring17/mcs.pdf
Does anyone have a good strategy for archiving and downloading the textbooks from aimath.org? They are all excellently formatted in HTML, but I am not certain what the best way to get the complete book would be.
When I have books in the form of webpages, I normally write a small crawler in Python, extract the text div with BeautifulSoup, add heading tags (<h1>, <h2>, …) for chapter names, and throw them all together as HTML. Add a cover image and combine everything with pandoc.
Nothing fancy, but it works reliably in an automated fashion.
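As a rough stdlib-only sketch of the extract-and-retag step (the commenter uses BeautifulSoup; `ChapterExtractor` and the `"content"` class name here are made-up examples):

```python
from html.parser import HTMLParser

class ChapterExtractor(HTMLParser):
    """Collect text from a content div; re-emit headings as <h1> tags
    so pandoc treats them as chapter breaks. Nested divs are not
    handled; this is only a sketch."""
    def __init__(self):
        super().__init__()
        self.in_content = False
        self.in_heading = False
        self.out = []

    def handle_starttag(self, tag, attrs):
        if tag == "div" and ("class", "content") in attrs:
            self.in_content = True
        elif self.in_content and tag in ("h1", "h2", "h3"):
            self.in_heading = True

    def handle_endtag(self, tag):
        if tag == "div":
            self.in_content = False
        elif self.in_heading and tag in ("h1", "h2", "h3"):
            self.in_heading = False

    def handle_data(self, data):
        if self.in_heading:
            self.out.append(f"<h1>{data.strip()}</h1>")
        elif self.in_content and data.strip():
            self.out.append(f"<p>{data.strip()}</p>")

page = '<div class="content"><h2>Chapter 1</h2><p>Some text.</p></div>'
p = ChapterExtractor()
p.feed(page)
html_out = "\n".join(p.out)
```

The resulting per-chapter HTML files can then be concatenated and fed to pandoc along with a cover image.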
I've found MIT's "Mathematics for Computer Science" actually hard to read, and in my opinion it can't really be called a proper book; it's more a collection of notes. It skips over many details, jumping to conclusions, definitions, and results without giving you an intuitive explanation. In that regard, it's actually hard to follow.
It's fast-going. You may want to start with something more introductory, like:
Levin http://discrete.openmathbooks.org/dmoi3.html
Hammack https://www.people.vcu.edu/~rhammack/BookOfProof/
the early parts of Newstead https://infinitedescent.xyz/
Make CPython segfault in 5 lines of code
Applications Are Now Open for YC Startup School – Starts in January
I'm presently living in a fairly rural part of the US due to an illness in the family. In the town of 14,000 I currently reside in, it's pretty difficult to network in a meaningful way and talk about my company with folks who can give guidance and feedback. Really excited for Startup School, and so grateful that YC has put this together.
> In the town of 14,000 I currently reside in, it's pretty difficult to network in a meaningful way and talk about my company with folks who can give guidance and feedback
GitLab and Zapier are examples of all-remote former YC companies.
"GitLab Handbook" https://about.gitlab.com/handbook/
"The Ultimate Guide to Remote Work: Lessons from a team of over 200 remote workers" https://zapier.com/learn/remote-work/
Were those companies remote when they did YC? I can't recall exactly where offhand but I seem to remember Paul Graham (or was it Sam Altman?) saying that they were strongly against remote teams in the early stages. It was a few years ago though, I wonder if their opinions have changed.
Startup School is now designed as a remote program.
It'd be interesting to hear from them about building an all-remote team culture with transparency and accountability. Are text-chat "digital stand-up meetings", with quality transcripts of each team member's responses to the three questions, enough? (Yesterday / Today and Tomorrow / Obstacles; i.e. What did I do since the last time we met? What will I do before the next time we meet? What obstacles are blocking my progress?)
Or are there longer-term planning sessions, focused on delivering value over a far longer horizon than first getting the MVP out and maximizing marginal profit by minimizing costs?
‘Adulting’ is hard. UC Berkeley has a class for that
+1 for Life Skills for Adulting and also Home Economics including Family and Meal Planning.
A bunch of resources from "Consumer science (a.k.a. home economics) as a college major" https://news.ycombinator.com/item?id=17894632 : CS 007: Personal Finance for Engineers, r/personalfinance/wiki, Healthy Eating Plate, Khan Academy > Science > Health and Medicine
And also, Instant Pot. The Instant Pot pressure cooker is your key to nutrient preservation and ultimate happiness.
Founder came back after 8 years to rewrite flash photoshop in canvas/WebGL
Five cities account for vast majority of growth in U.S. tech jobs: study
> Boston, San Francisco, San Jose, Seattle, and San Diego—accounted for more than 90% of the nation’s innovation-sector growth during the years 2005 to 2017
Why 2005 and why'd it drop off post-2017?
Also surprised DC and NYC aren't on the list. Lots of new companies and growth in those areas, but I guess they may not qualify as "innovation-sector growth".
> To that end, the present paper proposes that Congress assemble and award to a select set of metropolitan areas a major package of federal innovation inputs and supports that would accelerate their innovation-sector scale-up. Along these lines, we envision Congress establishing a rigorous competitive process by which the most promising eight to 10 potential growth centers would receive substantial financial and regulatory support for 10 years to become self-sustaining new innovation centers. Such an initiative would not only bring significant economic opportunity to more parts of the nation, but also boost U.S. competitiveness on the global stage.
"Potential growth centers" sounds promising.
Don’t Blame Tech Bros for the Housing Crisis
If there is demand for housing, we would expect people to be finding land and building housing unless there are policies that prevent this (and/or long commutes that people don't want to suffer) or higher-value opportunities.
If the city wanted residential areas (over commercial tax revenue giants), the city should have zoned residential.
The people elect city leaders. The people all want affordable housing.
With $4.5b from corporations and nowhere to build but out or up, high rise residential is the most likely outcome. (Which is typical for dense urban areas that have prioritized and attracted corporate tax revenue over affordable housing)
... Effing scooter bros with their scooters and their gold rush money and their tiny houses.
[Edit: more than] One company says "I will pay you $10,000 to leave the Bay Area / Silicon Valley" Because there's a lot of tech talent (because universities and opportunities) but ridiculously high expenses.
What an effectual headline from NY.
Docker is just static linking for millennials
No, LXC does quite a bit more than static linking. An inability to recognize that likely has nothing to do with generation.
Can you launch a process in a chroot, with cgroups? Okay, now upgrade everything it's linked with (without breaking the host/build system)
Configure a host-only network for a few processes – running in separate cgroups – without DHCP.
Want to criticize Docker? Rootless builds and containers are essentially impossible with it. Buildah and podman make rootless builds possible without a socket. Like sysvinit, though, IDK how well centralized logging (and log shipping, and logging of crashes and restarts) works without that socket.
Given comments like this, it's likely that you've never built a chroot for a different distro, or launched a process with cgroups.
Show HN: Bamboolib – A GUI for Pandas (Python Data Science)
This looks excellent. The ability to generate the Python code for the pandas dataframe transformations looks to be more useful than OpenRefine, TBH.
How much work would it be to use Dask (and Dask-ML) as a backend?
I see the OneHotEncoder button. Have you considered integration with Yellowbrick? They've probably already implemented a few of your near-future and someday roadmap items involving hyperparameter selection and model selection and visualization? https://www.scikit-yb.org/en/latest/
This video shows more of the advanced bamboolib features: https://youtu.be/I0a58h1OCcg
The live histogram rebinning looks useful. Recently I read about a 'shadowgram' / ~KDE approach with very many possible bin widths translucently overlaid in one chart. https://stats.stackexchange.com/questions/68999/how-to-smear...
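The many-bin-widths idea can be sketched in pure Python: compute counts for several candidate widths, then (in a real plot) overlay them translucently so stable features darken. The function name and widths here are illustrative:

```python
def histogram(data, bin_width, lo=None):
    """Count samples per bin of the given width, starting at lo."""
    lo = min(data) if lo is None else lo
    counts = {}
    for x in data:
        b = int((x - lo) // bin_width)
        counts[b] = counts.get(b, 0) + 1
    return counts

data = [1.0, 1.2, 2.7, 3.1, 3.3, 5.9]
# a "shadowgram" would draw each of these layers translucently,
# so regions that are dense at many bin widths show up darker
layers = {w: histogram(data, w) for w in (0.5, 1.0, 2.0)}
```

The plotting layer (e.g. matplotlib with a low alpha per layer) is left out; only the rebinning step is shown.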
Yellowbrick also has a bin width optimization visualization in yellowbrick.target.binning.BalancedBinningReference: https://www.scikit-yb.org/en/latest/api/target/binning.html
Great work.
Thank you for your feedback and support :) Are you currently using OpenRefine?
We are currently thinking about supporting other dataframe libraries like Dask, PySpark, and similar. However, we are a little unsure how to confirm that there is user demand before we implement it. It is not a complete rewrite, but it would require some additional abstractions at some points in the library, and we would need to check whether some features would no longer be available. Would Dask support be a reason for you to buy?
Great hint with Yellowbrick, and yes, we are considering some of those features as well if there is a useful place for them in the library.
In general, we are also thinking about ways you can extend the library yourself, so that you can add your own analyses/charts of choice and have them come up again at the right point in time, in case that is useful.
In the past, I've looked at OpenRefine and Jupyter integration. Once I've learned to do data transformation with pandas and sklearn with code, I'll report back to you.
Pandas-profiling has a number of cool descriptive statistics features as well. https://github.com/pandas-profiling/pandas-profiling
There's a new IterativeImputer in Scikit-learn 0.22 that it'd be cool to see visualizations of. https://twitter.com/TedPetrou/status/1197150813707108352 https://scikit-learn.org/stable/modules/impute.html
A plugin model would be cool; though configuring the container every time wouldn't be fun. Some ideas about how we could create a desktop version of binderhub in order to launch REES-compatible environments on our own resources: https://github.com/westurner/nbhandler/issues/1
Dask has only a subset of Pandas available.
Could you send me a link to the docs where they say which parts of Pandas are not included in Dask? Would love to take a closer look at this.
The set difference and/or intersection of dir(pd.DataFrame) and dir(dask.dataframe.DataFrame), along with inspect.getfullargspec and inspect.getdoc, would be a useful document for either or both projects.
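A minimal sketch of that dir()-set comparison, shown on toy classes (with pandas and dask installed, you would pass `pandas.DataFrame` and `dask.dataframe.DataFrame` instead):

```python
def api_diff(a, b):
    """Public attribute names of class a that class b lacks, and vice versa."""
    pub = lambda cls: {n for n in dir(cls) if not n.startswith("_")}
    return pub(a) - pub(b), pub(b) - pub(a)

# toy stand-ins for the full and subset APIs
class Full:
    def merge(self): ...
    def pivot(self): ...

class Subset:
    def merge(self): ...

only_full, only_subset = api_diff(Full, Subset)
```

Attaching each missing name's signature (inspect.getfullargspec) and docstring (inspect.getdoc) would turn this into the comparison document described above.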
pyfilemods generates a ReStructuredText document with introspected API comparisons. "Identify and compare Python file functions/methods and attributes from os, os.path, shutil, pathlib, and path.py" https://github.com/westurner/pyfilemods
Battery-Electric Heavy-Duty Equipment: It's Sort of Like a Cybertruck
> They’ve created a single platform that can be easily modified to do any number of jobs. For instance, their flagship product, the Dannar 4.00, can accept over 250 attachments from CAT, John Deere, or Bobcat. […] Having interoperability with so many different types of equipment, one platform can easily perform many tasks over the course of a year. This is a huge win for cash strapped municipalities. Why would a company or municipality opt to have a backhoe parked all winter long when it could be doing another job?
Does it have regenerative brakes?
Tools for turning descriptions into diagrams: text-to-picture resources
There's a sphinx extension (sphinxcontrib-sdedit) [1] for sdedit ("Quick Sequence Diagram Editor") which does require Java [2].
CSR: Corporate Social Responsibility
> Proponents argue that corporations increase long-term profits by operating with a CSR perspective, while critics argue that CSR distracts from businesses' economic role.
... The 3 Pillars of Corporate Sustainability: Environmental, Social, Economic https://www.investopedia.com/articles/investing/100515/three...
Three dimensions of sustainability: (Environment (Society (Economy))) https://en.wikipedia.org/wiki/Sustainability#Three_dimension...
What are some of the corporate sustainability reporting standards?
How can I score a candidate portfolio with sustainability metrics in order to impact invest with maximum impact?
> What are some of the corporate sustainability reporting standards?
From https://en.wikipedia.org/wiki/Sustainability_reporting#Initi... :
>> Organizations can improve their sustainability performance by measuring (EthicalQuote (CEQ)), monitoring and reporting on it, helping them have a positive impact on society, the economy, and a sustainable future. The key drivers for the quality of sustainability reports are the guidelines of the Global Reporting Initiative (GRI),[3] (ACCA) award schemes or rankings. The GRI Sustainability Reporting Guidelines enable all organizations worldwide to assess their sustainability performance and disclose the results in a similar way to financial reporting.[4] The largest database of corporate sustainability reports can be found on the website of the United Nations Global Compact initiative.
The GRI (Global Reporting Initiative) Standards are now aligned with the UN Sustainable Development Goals (#GlobalGoals). https://en.wikipedia.org/wiki/Global_Reporting_Initiative
>> In 2017, 63 percent of the largest 100 companies (N100), and 75 percent of the Global Fortune 250 (G250) reported applying the GRI reporting framework.[3]
> How can I score a candidate portfolio with sustainability metrics in order to impact invest with maximum impact?
Does anybody have solutions for this? AFAIU, existing cleantech funds are more hand-picked than screened according to sustainability fundamentals.
GTD Tickler file – a proposal for text file format
Taskwarrior is also built upon the todo.txt format. [1]
Taskw supports various task dates – { due: scheduled: wait: until: recur: } [2]
Taskw supports various named dates like soq/eocq, som/eom (start/end of [current] quarter, start/end of month), tomorrow, later [3]
Taskw recurring tasks (recur:) use the duration syntax: weekly/wk/w, monthly/mo, quarterly/qtr, yearly/yr, … [4]
Pandas has a "date offset" "frequency string" microsyntax that supports business days, quarters, and years; e.g. BQuarterEnd, BQuarterBegin [5]
IDK how usable by other tools these date string parsers are.
W/ just a text editor, having `todo.txt`, `daily.todo.txt`, and `weekly.todo.txt` (and `cleanhome.todo.txt` and `hygiene.todo.txt` with "## heading" tasks that get lost @where +sorting) works okay.
I have physical 43 folders, too: A 12 month and a 31 day expanding file. [6]
[2] https://taskwarrior.org/docs/using_dates.html
[3] https://taskwarrior.org/docs/named_dates.html
[4] https://taskwarrior.org/docs/durations.html
[5] https://pandas.pydata.org/pandas-docs/stable/user_guide/time...
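A hedged stdlib sketch of parsing such duration strings into timedeltas; the synonym table is partial, and months/quarters/years are approximated with fixed day counts, since datetime.timedelta has no calendar units:

```python
import re
from datetime import timedelta

# approximate day counts for taskwarrior-style durations like "3wk"
# or recur: values like "quarterly"; this table is deliberately partial
UNIT_DAYS = {
    "d": 1, "day": 1, "daily": 1,
    "w": 7, "wk": 7, "week": 7, "weekly": 7,
    "mo": 30, "month": 30, "monthly": 30,
    "q": 91, "qtr": 91, "quarterly": 91,
    "y": 365, "yr": 365, "year": 365, "yearly": 365,
}

def parse_duration(text):
    """Parse e.g. '3wk' or 'quarterly' into a timedelta."""
    m = re.fullmatch(r"(\d*)\s*([a-z]+)", text.strip().lower())
    if not m or m.group(2) not in UNIT_DAYS:
        raise ValueError(f"unrecognized duration: {text!r}")
    count = int(m.group(1) or 1)
    return timedelta(days=count * UNIT_DAYS[m.group(2)])
```

A tool that needs true calendar arithmetic (end-of-quarter, business days) would want pandas' date offsets [5] rather than fixed day counts.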
Ask HN: Any suggestion on how to test CLI applications?
Hello HN!
I've been looking at alternatives for how to test command line applications; specifically, for example, exit codes, output messages, and whatnot. I've seen "bats" https://github.com/sstephenson/bats and Bazel for testing, but I'm curious what other tools people use on a day-to-day basis. UI testing is nice with tools like Cypress.io, and maybe there's something out there that isn't as popular but is useful.
Thoughts?
pytest-docker-pexpect: https://github.com/nvbn/pytest-docker-pexpect
Pexpect: https://pexpect.readthedocs.io/en/stable/
pytest with subprocess.popen (or Sarge) may be sufficient for checking return codes and checking stdout and stderr output streams. Pytest has tmp_path and tmpdir fixtures that provide less test isolation than Docker containers: http://doc.pytest.org/en/latest/tmpdir.html
sarge.Capture.expect() takes a regex and returns None if there's no match: https://sarge.readthedocs.io/en/latest/tutorial.html#looking...
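A minimal sketch of the subprocess.run approach; a Python one-liner stands in for the CLI under test so the example is self-contained:

```python
import subprocess
import sys

def run_cli(*args):
    """Run a command and capture exit code, stdout, and stderr."""
    return subprocess.run(args, capture_output=True, text=True)

# stand-in CLI: python one-liners; swap in your real command
ok = run_cli(sys.executable, "-c", "print('hello')")
fail = run_cli(sys.executable, "-c", "import sys; sys.exit(2)")

# the same asserts work verbatim as pytest test bodies
assert ok.returncode == 0 and ok.stdout.strip() == "hello"
assert fail.returncode == 2
```

For interactive prompts (passwords, REPLs) you'd reach for pexpect instead, since pipes alone can't answer a prompt mid-run.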
The Golden Butterfly and the All Weather Portfolio
The Golden Butterfly (is a modified All Weather Portfolio)
> Stocks: 20% Domestic Large Cap Fund (Vanguard’s VTI or Goldman Sach’s JUST), 20% Domestic Small Cap Value (Vanguard’s VBR)
> Bonds: 20% Long Term (Vanguard’s BLV), 20% Short Term (Vanguard’s BSV)
> Real Assets: 20% Gold (SPDR’s GLD)
The All Weather Portfolio:
> Stocks: 30% Domestic Total Stock Market (VG total stock)
> Bonds: 40% Long Term, 15% Intermediate-Term
> Real Assets: 7.5% Commodities, 7.5% Gold
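As a quick illustration, here's how the Golden Butterfly weights above could drive a rebalancing calculation; the dollar holdings are hypothetical:

```python
# target weights from the Golden Butterfly allocation above
GOLDEN_BUTTERFLY = {"VTI": 0.20, "VBR": 0.20, "BLV": 0.20, "BSV": 0.20, "GLD": 0.20}

def rebalance(holdings, weights):
    """Dollar trades (buy > 0, sell < 0) to move holdings to target weights."""
    total = sum(holdings.values())
    return {sym: round(total * w - holdings.get(sym, 0.0), 2)
            for sym, w in weights.items()}

# hypothetical current dollar holdings that have drifted from 20% each
trades = rebalance(
    {"VTI": 3000, "VBR": 1500, "BLV": 2000, "BSV": 2000, "GLD": 1500},
    GOLDEN_BUTTERFLY)
```

Since only existing dollars are reallocated, the trades net to zero (ignoring transaction costs and taxes, which a real rebalance would not).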
Canada's Decision To Make Public More Clinical Trial Data Puts Pressure On FDA
We really could get more out of this data through international collaboration and through linked data (e.g. URIs for columns). See: "Open, and Linked, FDA data" https://github.com/FDA/openfda/issues/5#issuecomment-5392966... and "ENH: Adverse Event Count / 'Use' Count Heatmap" https://github.com/FDA/openfda/issues/49
With sales/usage counts, we'd have a denominator with which we could calculate relative hazard.
Python Alternative to Docker
Shiv does not solve for what containers and Docker/Podman/Buildah/Containerd solve for: re-launching processes at boot and failure, launching processes in chroots or cgroups (with least privileges), limiting access to network ports, limiting access to the host filesystem, building chroots / images, [...]
You can run build tools like shiv with a RUN instruction in a Dockerfile and get some caching.
You can build a zipapp with shiv (in a build container) and run the zipapp in a container.
Should the zipapp contain the test suite(s) and test_requires so that the tests can be run in an environment most similar to production?
It's much easier to develop with code on the filesystem (instead of in a zipapp).
It's definitely faster to read the whole zipapp into RAM than to stat and read each imported module from the filesystem once at startup.
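For comparison, the stdlib zipapp module (which shiv builds on, adding dependency vendoring and caching) can produce and run a single-file archive; everything here is a toy example:

```python
import subprocess
import sys
import tempfile
import zipapp
from pathlib import Path

with tempfile.TemporaryDirectory() as tmp:
    src = Path(tmp) / "app"
    src.mkdir()
    # a trivial app; shiv would also vendor site-packages deps into the archive
    (src / "__main__.py").write_text("print('hello from a zipapp')\n")
    target = Path(tmp) / "app.pyz"
    zipapp.create_archive(src, target)
    # the .pyz runs as a single file under any compatible interpreter
    out = subprocess.run([sys.executable, str(target)],
                         capture_output=True, text=True)
```

Note that nothing here touches isolation: the zipapp runs with the invoking user's full privileges and filesystem view, which is the point of the title critique.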
There may be a better post title than the current "Python Alternative to Docker"? Shiv is a packaging utility for building Python zipapps; it is not an alternative to process isolation with containers (or VMs).
$6B United Nations Agency Launches Bitcoin, Ethereum Crypto Fund
"UNICEF launches Cryptocurrency Fund: UN Children’s agency becomes first UN Organization to hold and make transactions in cryptocurrency" https://www.unicef.org/press-releases/unicef-launches-crypto...
From https://www.unicefusa.org/ :
> UNICEF USA helps save and protect the world's most vulnerable children. UNICEF USA is rated one of the best charities to donate to: 89% of every dollar spent goes directly to help children.
Timsort, the Python sorting algorithm
Here are the Python 3 docs for sorting [1], in-place list.sort() [2], and sorted() [3] (which makes a sorted copy of the references). And the Timsort Wikipedia page [4].
[1] https://docs.python.org/3/howto/sorting.html#sort-stability-...
[2] https://docs.python.org/3/library/stdtypes.html#list.sort
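A quick illustration of the difference between the two, and of Timsort's stability (ties keep their input order):

```python
records = [("alice", 3), ("bob", 1), ("carol", 3), ("dave", 2)]

# sorted() returns a new list; the original is untouched
by_score = sorted(records, key=lambda r: r[1])

# stability: alice and carol tie on score 3 and keep their input order
assert [n for n, s in by_score if s == 3] == ["alice", "carol"]

# list.sort() sorts in place and returns None
records.sort(key=lambda r: r[1])
```

Stability is what makes multi-key sorts composable in Python: sort by the secondary key first, then by the primary key.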
Supreme Court allows blind people to sue retailers if websites aren't accessible
"a11y": Accessibility
https://a11yproject.com/ has patterns, a checklist for checking web accessibility, resources, and events.
awesome-a11y has a list of a number of great resources for developing accessible applications: https://github.com/brunopulis/awesome-a11y
In terms of W3C specifications [1], you've got: WAI-ARIA (Web Accessibility Initiative: Accessible Rich Internet Applications) [2], and WCAG: Web Content Accessibility Guidelines [3]. The new W3C Payment Request API [4] makes it easy for browsers to offer a standard (and probably(?) already accessible) interface for the payment data entry screen, at least.
There are a number of automated accessibility testing platforms. "[W3C WAI] Web Accessibility Evaluation Tools List" [5] lists quite a few. Can someone recommend a good accessibility testing tool? Is Google Lighthouse (now included with Chrome DevTools and as a standalone script) a good tool for accessibility reviews?
[1] https://github.com/brunopulis/awesome-a11y/blob/master/topic...
[2] https://www.w3.org/TR/using-aria/
[3] https://www.w3.org/WAI/standards-guidelines/wcag/
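As a toy example of what such automated checkers do, here is one WCAG-style rule (images need text alternatives) implemented with stdlib html.parser; real tools like Lighthouse or axe run hundreds of rules, many requiring a rendered DOM:

```python
from html.parser import HTMLParser

class MissingAltChecker(HTMLParser):
    """Flag <img> tags without an alt attribute (WCAG 1.1.1 text alternatives)."""
    def __init__(self):
        super().__init__()
        self.violations = []

    def handle_starttag(self, tag, attrs):
        attrs = dict(attrs)
        if tag == "img" and "alt" not in attrs:
            self.violations.append(attrs.get("src", "<no src>"))

page = '<img src="logo.png" alt="Company logo"><img src="chart.png">'
checker = MissingAltChecker()
checker.feed(page)
```

Static checks like this catch the easy cases; whether the alt text is actually meaningful still needs a human review.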
Streamlit: Turn a Python script into an interactive data analysis tool
Cool!
requests_cache caches HTML requests into one SQLite database. [1] pandas-datareader can cache external data requests with requests-cache. [2]
dask.cache can do opportunistic caching (of 2GB of data). [3]
How does streamlit compare to jupyter voila dashboards (with widgets and callbacks)? They just launched a new separate github org for the project. [4] There's a gallery of voila dashboard examples. [5]
> Voila serves live Jupyter notebooks including Jupyter interactive widgets.
> Unlike the usual HTML-converted notebooks, each user connecting to the Voila tornado application gets a dedicated Jupyter kernel which can execute the callbacks to changes in Jupyter interactive widgets.
> - By default, voila disallows execute requests from the front-end, preventing execution of arbitrary code.
[1] https://github.com/reclosedev/requests-cache
[2] https://pandas-datareader.readthedocs.io/en/latest/cache.htm...
[3] https://docs.dask.org/en/latest/caching.html
[4] https://github.com/voila-dashboards/voila
[5] https://blog.jupyter.org/a-gallery-of-voil%C3%A0-examples-a2...
Access control and resource exhaustion are challenges with building any {Flask, framework_x,} app [from Jupyter notebooks]. First it's "HTTP Digest authentication should be enough for now"; then it's "let's use SSO and LDAP" (and review every release); then it's "why is it so sloww?". JupyterHub has authentication backends, spawners, and per-user-container/VM resource limits.
> Each user on your JupyterHub gets a slice of memory and CPU to use. There are two ways to specify how much users get to use: resource guarantees and resource limits. [6]
[6] https://zero-to-jupyterhub.readthedocs.io/en/latest/user-res...
Some notes re: voila and JupyterHub:
> The reason for having a single instance running voila only is to allow non JupyterHub users to have access to the dashboards. So without going through the Hub auth flow.
> What are the requirements in your case? Voila can be installed in the single user Docker image, so that each user can also use it on their own server (as a server extension for example). [7]
Scott’s Supreme Quantum Supremacy FAQ
Who even asked these questions?
I question this. All of this.
Literally people on HN have been asking many of these questions for years whenever QC is discussed.
Ask HN: How do you handle/maintain local Python environments?
I'm having some trouble figuring out how to handle my local Python. I'm not asking about 2 vs 3 - that ship has sailed - I'm confused on which binary to be using. From the way I see it, there's at least 4 different Pythons I could be using:
1 - Python shipped with OS X/Ubuntu
2 - brew/apt install python
3 - Anaconda
4 - Getting Python from https://www.python.org/downloads/
And that's before getting into how you get numpy et al installed. What's the general consensus on which to use? It seems like the OS X default is compiled with Clang while brew's version is built with GCC. I've been working through this book [1] and found this thread [2]. I really want to make sure I'm using fast/optimized linear algebra libraries; is there an easy way to verify that? I use Python for learning data science/bioinformatics, learning MicroPython for embedded, and general automation stuff. Is it possible to have one environment that performs well for all of these?
[1] https://www.amazon.com/Python-Data-Analysis-Wrangling-IPython/dp/1449319793
[2] https://www.reddit.com/r/Python/comments/46r8u0/numpylinalgsolve_is_6x_faster_on_my_mac_than_on/
I just use Anaconda. It's basically the final word in monolithic Python distributions.
For data science/numerical computation, all batteries are included. It also has fast optimized linear algebra (MKL), plus extras like dask (parallelization) and numba, out of the box. No fuss no muss. No need to fiddle with anything.
Everything else is a "pip install" or "conda install" away. Virtual envs? Has it. Run different Python versions on the same machine via conda environments? Has it. Web dev with Django etc.? All there. Need to containerize? miniconda.
The only downside? It's quite big and takes a while to install. But it's a one time cost.
I also prefer conda for the same reasons.
Precompiled MKL is really nice. Conda and conda-forge now build for aarch64. There are very few wheels for aarch64 on PyPI. Conda can install things like Qt (IPython-qt, spyder,) and NodeJS (JupyterLab extensions).
If I want to switch python versions for a given condaenv (instead of just creating a new condaenv for a different CPython/PyPy version), I can just run e.g. `conda install -y python=3.7` and it'll reinstall everything in the depgraph that depended on the previous python version.
I always just install miniconda instead of the whole anaconda distribution. I always create condaenvs (and avoid installing anything in the root condaenv) so that I can `conda-env export -f environment.yml` and clean that up.
BinderHub ( https://mybinder.org/ ) creates docker containers from {git repos, Zenodo, FigShare,} and launches them in free cloud instances also running JupyterLab by building containers with repo2docker (with REES (Reproducible Execution Environment Specification)). This means that all I have to do is add an environment.yml to my git repo in order to get Binder support so that people can just click on the badge in the README to launch JupyterLab with all of the dependencies installed.
REES supports a number of dependency specifications: requirements.txt, Pipfile.lock, environment.yml, aptSources, postBuild. With an environment.yml, I can install the necessary CPython/PyPy version and everything else.
...
In my dotfiles, I have a setup_miniconda.sh script that installs miniconda into per-CPython-version CONDA_ROOT and then creates a CONDA_ENVS_PATH for the condaenvs. It may be overkill because I could just specify a different python version for all of the conda envs in one CONDA_ENVS_PATH, but it keeps things relatively organized and easily diffable: CONDA_ROOT="~/-wrk/-conda37" CONDA_ENVS_PATH="~/-wrk/-ce37"
I run `_setup_conda 37; workon_conda|wec dotfiles` to work on the ~/-wrk/-ce37/dotfiles condaenv and set _WRD=~/-wrk/-ce37/dotfiles/src/dotfiles.
Similarly, for virtualenvwrapper virtualenvs, I run `WORKON_HOME=~/-wrk/-ve37 workon|we dotfiles` to set all of the venv cdaliases; i.e. then _WRD="~/-wrk/-ve37/dotfiles/src/dotfiles" and I can just type `cdwrd|cdw` to cd to the working directory. (Some of the other cdaliases are: {cdwrk, cdve|cdce, cdvirtualenv|cdv, cdsrc|cds}. So far, I have implemented cdalias support for bash, IPython, and vim)
One nice thing about defining _WRD is I can run `makew <tab>` and `gitw` to `cd $_WRD; make <tab>` and `git -C $_WRD` without having to change directory and then `cd -` to return to where I was.
So, for development, I use a combination of virtualenvwrapper, pipsi, conda, and some shell scripts in my dotfiles that I should get around to releasing and maintaining someday. https://westurner.github.io/dotfiles/venv
For publishing projects, I like environment.yml because of the REES support.
Is the era of the $100 graphing calculator coming to an end?
For $100, you can buy a Pinebook with an 11" or 14" screen, a multitouch trackpad, gigabytes of storage, WiFi, a keyboard without a numpad, and an ARM processor.
On this machine, you can create reproducible analyses with JupyterLab; do arithmetic with Python; work with multidimensional arrays with NumPy, SciPy, Pandas, xarray, Dask; do machine learning with Statsmodels, Scikit-learn, Dask-ML, TPOT; create books of these notebooks (containing code, notes (in Markdown, which is easily transformed to HTML), and LaTeX equations) with jupyter-book, nbsphinx, git + BinderHub; store the revision history of your discoveries; publish what you've discovered and learned to public or private git repositories; and complete graded exercises with nbgrader.
But the task is to prepare for a world of mental arithmetic, no validation, no tests, no reference materials, and no search engines; and CAS (Computer Algebra System) tools like SymPy and Sage are not allowed.
On this machine, you can write code, write papers, build spreadsheets and/or Jupyter notebooks, run physical simulations, explore the stars, and play games and watch videos. Videos like: Khan Academy videos and exercises that you can watch and do, with validation, until you've achieved mastery and move on to the next task on your todo.txt list.
But the task is to preserve your creativity and natural curiosity despite the compulsory education system's demands for quality control and allocative efficiency; in an environment where drama and popularity are the solutions to relatedness and acceptance needs.
I have three of these $100 calculators in my toolbox. It's been so long since I've powered them on that I'm concerned that the rechargeable AAA batteries are leaking battery acid.
For $100, you can buy an ARM notebook and install conda and conda-forge packages and build sweet visualizations to collaborate with colleagues on (with Seaborn (matplotlib), HoloViews, Altair, Plotly)
"You must buy a $100 calculator that only runs BASIC and ASM, and only use it for arithmetic so that we can measure you."
Hand tools are fun, but please don't waste any more of my compulsory time.
I would LOVE to live in a world where Jupyter notebooks are the de facto standard for post-primary education. It would take some vision and leadership to bring about that sort of change, but it would far better prepare kids for the world they will be running one day.
Reinventing Home Directories
Full Disclosure: I am violently opposed to systemd and pulseaudio (before it was taken away from Poettering).
However, I think I'm seeing a pattern: Poettering identifies real issues and actually tries to fix them.
But it seems like his ambition is impeding his ability to consider those who do not want to approach the fix his way. It's never a measured approach, it /feels/ like a "lets write it now and figure out all the problems down the line and fix them" which, for core pieces of software is a dangerous mindset.
After watching the intro here I agree with him, the current state of things is not ideal, and nobody _wants_ to touch user account management on Linux. On MacOS for example, it's much more thought out, and I think Linux could do it better.
As much as I hate Poettering's specific approaches (and subsequent lockout), I have to give him credit for identifying these issues and actually trying to fix them; it's more than I do. Then again, I am a lowly sysadmin type.
I like the tone of your comment, unlike many others here. But to your remark:
"But it seems like his ambition is impeding his ability to consider those who do not want to approach the fix his way. It's never a measured approach, it /feels/ like a "lets write it now and figure out all the problems down the line and fix them" which, for core pieces of software is a dangerous mindset."
I'd like to say that he forces nobody, and so we should not complain about his creative endeavors. The market will decide, and you will decide.
I disagree. You can argue the semantics of whether it's him specifically or another project that decides it will only target systemd/pulseaudio. But the fact that, if I want a stable distro for server usage in 2019, I _must_ use a systemd distro speaks to the fact that I am indeed forced to use it.
Maybe that's just adoption, though; typically the sysadmin community considers the "safe" options to be CentOS/RHEL or Debian Stable, both of which default to systemd. (And no sysadmin is going to change the init system on production servers unless it presents some abject nightmare, short of eating 30% CPU/mem; it's core software, it won't be touched.)
Projects like Devuan are simply not considered safe choices by the majority of the sysadmin community, and there's no alternative for RHEL-esque sites.
So, semantically: it's not poetterings fault. But at the same time, I'm still forced.
Sorry, but you'll need to find a word different from "forced"
What word would you recommend?
If I must do something, then I am forced, no? And I'm not being facetious; I'm genuinely curious.
If I wish to keep using production grade linux distributions (that my company has been using for 15+ years) then I must adopt systemd.
I guess I could convince my 25,000+ person organisation to move everything to a Debian-based OS, but that's not likely.
Why do you think that is? Why have production grade Linux distributions all chosen to adopt systemd?
With SysV init, how do you securely launch processes in cgroups, such that they'll consistently restart when the process happens to terminate, with stdout and stderr logged with consistent timestamps, with process dependency models that allow for faster boots due to parallelization?
(edit)
Journalctl is far better than `tail -f /var/log/**` and parsing all of those timestamps and inconsistently escaped logfile formats. There's no good way to modify everything in /etc/init.d in order to log to syslog-ng or rsyslog. Systemd and journalctl solve that with unified logging.
IMO, there's no question that systemd is the better way, and I have zero nostalgia for spawning everything from whatever shell is specified in an /etc/init.d shebang, without process restarting (and logging thereof), cgroups, or consistent logging.
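The restart-on-failure behavior being discussed (which raw SysV init scripts lack) can be sketched as a toy supervision loop in pure Python. This is a simplified illustration of the concept only, not how systemd is implemented; the function and service names are made up:

```python
import time

def supervise(start_service, max_restarts=5, backoff=0.0):
    """Toy supervision loop: restart a service whenever it exits
    with failure, up to max_restarts times (roughly the idea behind
    systemd's Restart=on-failure, greatly simplified)."""
    restarts = 0
    while True:
        exit_code = start_service()  # blocks until the "service" exits
        if exit_code == 0:
            return restarts          # clean exit: stop supervising
        restarts += 1
        if restarts >= max_restarts:
            return restarts          # give up (cf. systemd's StartLimitBurst)
        time.sleep(backoff)          # cf. systemd's RestartSec=

# A fake service that fails twice, then exits cleanly.
attempts = {"n": 0}
def flaky_service():
    attempts["n"] += 1
    return 1 if attempts["n"] <= 2 else 0

restarts = supervise(flaky_service)
```

The real thing adds cgroup containment, dependency ordering, and journal capture of stdout/stderr on top of this basic loop.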
> Why do you think that is? Why have production grade Linux distributions all chosen to adopt systemd?
Because RH funds a lot of software development, and Poettering works for them.
> Journalctl is far better than `tail -f /var/log/**`
Except when journald shits the bed and corrupts its own log file. Or when you want to ship logs off-machine, yet journald does not support any standards-compliant transport, so you end up having to run a second logging daemon (e.g., rsyslog), in which case what's the damn point of the first? Or when you do a "service mysql start" and it dies, but there is no output anywhere of what went wrong, and you have to guess that you have a typo in your "my.cnf".
But other than that, Mrs. Lincoln, how was the play?
> ... in an /etc/init.d shebang without process restarting (and logging thereof), cgroups, and consistent logging.
The problem is not systemd-as-init-replacement. The problem is systemd-as-kitchen-sink. udevd is an "essential" part of systemd? Really? It can't be its own stand-alone project? Really?
When journald logfile corruption occurs, it's detected and it starts writing a new logfile.
When flatfile logfile corruption occurs, it's not detected and there are multiple logfile formats to contend with. And multiple haphazard logrotate configs.
Here's how to use a separate process to ship journald logs - from one file handle - to a remote logging service: https://unix.stackexchange.com/questions/394822/how-should-i...
While there is a systemd-journal-remote, it's not necessary for journald to try and replicate what's already solved and tested in rsyslog and syslog-ng.
It's quite a bit more work to add every new service to the syslog-ng or rsyslog configuration than to just ship one journald log.
Furthermore, service start/stop events are already in the same stream (with the same timestamp format) with the services' stdout and stderr.
Why hasn't anyone written fsck for corrupted journald recovery?
...
I have not needed to makedev and chown and chattr and chcon anything in very many years. When you accidentally (newbishly) delete something from a static /dev and rebooting doesn't fix it and you have no idea what the major/minor numbers were, it sucks bad.
When you're trying to boot a system on a different machine but it doesn't work because the NIC is on a different bus, it's really annoying to have to symlink /dev or modify /etc. With udevd, all you need to do is define a rule to map the bus ID device name to e.g. eth0. I can remember encountering the devfs race condition resulting in eth0 and eth1 being mapped to different devices on different boots, which was dangerous because firewall rules are applied to device names.
Udev has been in the kernel since 2.6.
"What problems does udev actually solve?" https://superuser.com/questions/686774/what-problems-does-ud...
With integrated udev and systemd, I have no reason to run a separate hotplugd with a different config format (again with no cgroup support) and a different logstream.
Perhaps ironically, here's a link to the presentation PDF that was posted yesterday: https://news.ycombinator.com/item?id=21036020
And my comments there:
> What a good idea.
> Here's the hyperlinkified link to the {systemd-homed.service, systemd-userdbd.service, homectl, userdbctl} sources from the PDF: https://github.com/poettering/systemd/tree/homed
> Hadn't heard of varlink: https://varlink.org/
> Is there a FIPS-like subset of the most-widely-available LUKS configs? Otherwise home directories won't work on systems that have a limited set of LUKS modules.
Serverless: slower and more expensive
It'd be interesting to see how much this same workload would cost with e.g. OpenFaaS on k8s with autoscaling to zero; but there also you'd need to include maintenance costs like OS and FaaS stack upgrades. https://docs.openfaas.com/architecture/autoscaling/
Entropy can be used to understand systems
Maximum entropy: https://en.wikipedia.org/wiki/Maximum_entropy
Here's a quote of a tweet about my own comment on a schema:BlogPost: https://twitter.com/westurner/status/1048125281146421249:
> “When Bayes, Ockham, and Shannon come together to define machine learning” https://towardsdatascience.com/when-bayes-ockham-and-shannon...
> Comment: "How does this relate to the Principle of Maximum Entropy? How does Minimum Description Length relate to Kolmogorov Complexity?"
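A minimal pure-Python illustration of Shannon entropy and the maximum-entropy idea referenced above (among distributions over a fixed set of outcomes, the uniform one has the highest entropy):

```python
import math

def shannon_entropy(probs):
    """H(p) = -sum(p_i * log2(p_i)), in bits.
    Zero-probability outcomes contribute nothing."""
    return -sum(p * math.log2(p) for p in probs if p > 0)

uniform = [0.25, 0.25, 0.25, 0.25]  # maximum entropy over 4 outcomes
skewed = [0.7, 0.1, 0.1, 0.1]       # any non-uniform distribution has less

h_uniform = shannon_entropy(uniform)  # 2.0 bits
h_skewed = shannon_entropy(skewed)    # strictly less than 2.0 bits
```

The maximum-entropy principle says: among all distributions consistent with your constraints, pick the one maximizing H, i.e., assume nothing beyond what the constraints force.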
Thanks for sharing! Despite the fact that Shannon's "A Mathematical Theory of Communication" is so accessible, I find that most in our field (stats/ML) don't often think through information-theoretic tools in a "first principles way."
Yes, KL divergences show up everywhere, but they are not derived from scratch often enough. Maybe I'm stifled by my campus bubble though :)
New Query Language for Graph Databases to Become International Standard
Graph query languages are nice and all, but what about Linked Data here? Queries of schemaless graphs miss lots of data, because without a schema this graph calls it "color" and that graph calls it "colour" and that graph calls it "色" or "カラー". (Of course this is also an issue even when there is a defined schema; but it's hardly possible to just happen upon comprehensible inter- or even intra-organizational cohesion without e.g. RDFS and/or OWL and/or SHACL for describing (and changing) the shape of the data.)
So, the task is then to compile schema-aware SPARQL to GQL or GraphQL or SQL or interminable recursive SQL queries or whatever it is.
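The "color"/"colour"/"色" problem is exactly what a JSON-LD-style context addresses: each source maps its local property names onto shared URIs before merging. A toy pure-Python sketch of that expansion step (the context dicts and record values are illustrative; schema.org/color is a real property URI):

```python
# Each source ships records with local property names plus a context
# mapping those names to shared URIs -- essentially what a JSON-LD
# @context does.
COLOR = "https://schema.org/color"

source_a = {"records": [{"color": "red"}], "context": {"color": COLOR}}
source_b = {"records": [{"colour": "blue"}], "context": {"colour": COLOR}}
source_c = {"records": [{"色": "green"}], "context": {"色": COLOR}}

def expand(source):
    """Rewrite each record's local keys to URI form so that records
    from different sources merge under one shared vocabulary."""
    ctx = source["context"]
    return [{ctx.get(k, k): v for k, v in rec.items()}
            for rec in source["records"]]

merged = expand(source_a) + expand(source_b) + expand(source_c)
colors = [rec[COLOR] for rec in merged]  # all three, despite three spellings
```

A schemaless graph query for "color" would have found only one of the three records; after expansion, one query finds all of them.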
For GraphQL, there's GraphQL-LD (which somewhat unfortunately contains a hashtag-indeterminate dash). I cite this in full here because it's very relevant to the GQL task at hand:
"GraphQL-LD: Linked Data Querying with GraphQL" (2018) https://comunica.github.io/Article-ISWC2018-Demo-GraphQlLD/
> GraphQL is a query language that has proven to be a popular among developers. In 2015, the GraphQL framework [3] was introduced by Facebook as an alternative way of querying data through interfaces. Since then, GraphQL has been gaining increasing attention among developers, partly due to its simplicity in usage, and its large collection of supporting tools. One major disadvantage of GraphQL compared to SPARQL is the fact that it has no notion of semantics, i.e., it requires an interface-specific schema. This therefore makes it difficult to combine GraphQL data that originates from different sources. This is then further complicated by the fact that GraphQL has no notion of global identifiers, which is possible in RDF through the use of URIs. Furthermore, GraphQL is however not as expressive as SPARQL, as GraphQL queries represent trees [4], and not full graphs as in SPARQL.
> In this work, we introduce GraphQL-LD, an approach for extending GraphQL queries with a JSON-LD context [5], so that they can be used to evaluate queries over RDF data. This results in a query language that is less expressive than SPARQL, but can still achieve many of the typical data retrieval tasks in applications. Our approach consists of an algorithm that translates GraphQL-LD queries to SPARQL algebra [6]. This allows such queries to be used as an alternative input to SPARQL engines, and thereby opens up the world of RDF data to the large amount of people that already know GraphQL. Furthermore, results can be translated into the GraphQL-prescribed shapes. The only additional requirement is their queries would now also need a JSON-LD context, which could be provided by external domain experts.
> In related work, HyperGraphQL [7] was introduced as a way to expose access to RDF sources through GraphQL queries and emit results as JSON-LD. The difference with our approach is that HyperGraphQL requires a service to be set up that acts as a intermediary between the GraphQL client and the RDF sources. Instead, our approach enables agents to directly query RDF sources by translating GraphQL queries client-side.
All of these RDFS vocabularies and OWL ontologies provide structure that minimizes the costs of merging and/or querying multiple datasets: https://lov.linkeddata.es/dataset/lov/
All of these schema.org/Dataset s in the "Linked Open Data Cloud" are easier to query than a schemaless graph: https://lod-cloud.net/ . Though one can query schemaless graphs with SPARQL, as well.
For reference, RDFLib has a bunch of RDF graph implementations over various key/value and SQL store backends. RDFLib-sqlalchemy does query parametrization correctly in order to minimize the risk of query injection. For the record, SQL Injection is the #1 most prevalent security weakness in the CWE Top 25; which is something that any new spec and implementation should really consider before launching anything other than an e.g. overly-verbose JSON-based query language that people end up bolting a micro-DSL onto. https://github.com/RDFLib/rdflib-sqlalchemy
Most practically, I frequently want to read a graph of objects into RAM; update, extend, and interlink; and then transactionally save the delta back to the store. This requires a few things: (1) an efficient binary serialization protocol like Apache Arrow (SIMD), Parquet, or any of the BSON binary JSONs; (2) a transactional local store that can be manually synchronized with the remote store until it's consistent.
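The delta part of that workflow can be sketched with plain sets of triples — a toy illustration (triple names are made up) of computing the insert/delete sets that a transactional save would send back to the store, which is the shape of a SPARQL 1.1 Update `DELETE DATA { ... }; INSERT DATA { ... }` request:

```python
def triple_delta(before, after):
    """Compute the insert/delete sets needed to move a store from
    state `before` to state `after`."""
    return {"insert": after - before, "delete": before - after}

# Toy triples: (subject, predicate, object); names are illustrative.
before = {(":book1", ":title", "Draft"),
          (":book1", ":pages", 100)}

after = set(before)
after.discard((":book1", ":title", "Draft"))
after.add((":book1", ":title", "Final"))      # updated in RAM
after.add((":book1", ":author", ":alice"))    # extended in RAM

delta = triple_delta(before, after)
```

Only the delta crosses the wire; the unchanged `:pages` triple appears in neither set.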
SPARQL Update was somewhat of an out-of-scope afterthought. Here's SPARQL 1.1 Update: https://www.w3.org/TR/sparql11-update/
Here's SOLID, which could be implemented with SPARQL on GQL, too; though all the re-serialization really shouldn't be necessary for EAV triples with a named graph URI identifier: https://solidproject.org/
5 star data: PDF -> XLS -> CSV -> RDF (GQL, AFAIU (but with no URIs(!?))) -> LOD https://5stardata.info/en/
Linked Data tends to live in a semantic web world that has a lot of open world assumptions. While there are a few systems like this out there, there aren't many. More practically focused systems collapse this worldview down into a much simpler model, and property graphs suit just fine.
There's nothing wrong with enabling linked data use cases, but you don't need RDF+SPARQL+OWL and the like to do that.
The "semantic web stack" I think has been shown by time and implementation experience to be an elegant set of standards and solutions for problems that very few real world systems want to tackle. In the intervening 2 full generations of tech development that have happened since a lot of those standards were born, some of the underlying stuff too (most particularly XML and XML-NS) went from indispensable to just plain irritating.
> Linked Data tends to live in a semantic web world that has a lot of open world assumptions. While there are a few systems like this out there, there aren't many. More practically focused systems collapse this worldview down into a much simpler model, and property graphs suit just fine.
Data integration is cost-prohibitive. In n years' time, the task is "let's move all of these data silos into a data lake housed in our singular data warehouse, and then synchronize and also copy data around to efficiently query it in one form or another."
Linked data enables data integration from day one: it enables the linking of tragically silo'd records within disparate databases.
There are very very many systems that share linked data. Some only label some of the properties with URIs in templates. Some enable federated online querying.
When you develop a schema for only one application implementation, you're tragically limiting the future value of the data.
> There's nothing wrong with enabling linked data use cases, but you don't need RDF+SPARQL+OWL and the like to do that.
Can you name a property graph use case that cannot be solved with RDFS and SPARQL?
> The "semantic web stack" I think has been shown by time and implementation experience to be an elegant set of standards and solutions for problems that very few real world systems want to tackle.
TBH, I think the problem is that people don't understand the value in linking our data silos through URIs, and so they don't take the time to learn RDFS or JSON-LD (which is pretty simple, and useful for very important things like SEO: search engine result cards come from linked data embedded in HTML attributes (RDFa, Microdata) or in JSON-LD).
The action buttons to 'RSVP', 'Track Package', and 'View Issue' on Gmail emails are schema.org JSON-LD.
Applications can use linked data in any part of the stack: the database, the messages on the message queue, in the UI.
You might take a look at all of the use cases that SOLID solves for and realize how much unnecessary re-work has gone into indexing structs and forms validation. These are all the same app with UIs for interlinked subclasses of https://schema.org/Thing with unique inferred properties and aggregations thereof.
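For a concrete sense of that embedded JSON-LD, here's roughly the kind of schema.org action block that gets placed in an email or page inside a `<script type="application/ld+json">` tag. The values are made up for illustration; the `@type`s (`EmailMessage`, `ViewAction`) and `potentialAction` are real schema.org terms:

```python
import json

# An illustrative schema.org ViewAction (values are hypothetical).
action = {
    "@context": "https://schema.org",
    "@type": "EmailMessage",
    "potentialAction": {
        "@type": "ViewAction",
        "target": "https://example.com/issues/42",
        "name": "View Issue",
    },
}

# The snippet as it would be embedded in HTML.
snippet = ('<script type="application/ld+json">'
           + json.dumps(action)
           + "</script>")
```

A mail client or crawler that understands schema.org can parse this out and render the 'View Issue' button without scraping the message body.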
> In the intervening 2 full generations of tech development that have happened since a lot of those standards were born, some of the underlying stuff too (most particularly XML and XML-NS) went from indispensable to just plain irritating.
Without XSD, for example, we have no portable way to share complex fractions.
There's a compact representation of JSON-LD that minimizes record schema overhead (which gzip or lzma generally handle anyway)
https://lod-cloud.net is not a trivial or insignificant amount of linked data: there's real value in structuring property graphs with standard semantics.
Are our brains URI-labeled graphs? Nope, and we spend a ton of time talking to share data. Eventually, it's "well let's just get a spreadsheet and define some columns" for these property graph objects. And then, the other teams' spreadsheets have very similar columns with different labels and no portable datatypes (instead of URIs)
> Can you name a property graph use case that cannot be solved with RDFS and SPARQL?
No - that's not the point. Of course you can do it with RDFS + SPARQL. For that matter you could do it with redis. Fully beside the point.
What's important is what the more fluent and easy way to do things is. People vote with their feet, and property graphs are demonstrably easier to work with for most use cases.
“Easier” is completely subjective; there's no way you can demonstrate that.
RDF solves a much larger problem than just graph data model and query. It addresses data interchange on the web scale, using URIs, zero-cost merge, Linked Data etc.
> “Easier” is completely subjective, no way you can demonstrate that.
I agree it's subjective. While there's no exact measurement for this sort of thing, the proxy measure people usually use is adoption; and if you look into for example Cypher vs. SPARQL adoption, Neo4j vs. RDF store adoption, people are basically voting with their feet.
From my personal experiences developing software with both, I've found property graphs much simpler and a better map for how people think of data.
It's true that RDF tries to solve data interchange on the web scale. That's what it was designed for. But the original design vision, in my view, hasn't come to fruition. There are bits and pieces that have been adopted to great effect (things like RDF microformats for tagging HTML docs) but nothing like what the vision was.
What was the vision?
The RDFJS "Comparison of RDFJS libraries" wiki page lists a number of implementations; though none for React or AngularJS yet, unfortunately. https://www.w3.org/community/rdfjs/wiki/Comparison_of_RDFJS_...
There's extra work to build general purpose frameworks for Linked Data. It may have been hard for any firm with limited resources to justify doing it the harder way (for collective returns)
Dokieli (SOLID (LDP,), WebID, W3C Web Annotations,) is a pretty cool - if deceptively simple-looking - showcase of what's possible with Linked Data; it just needs some CSS and a revenue model to pay for moderation. https://dokie.li/
> property graphs are demonstrably easier to work with for most use cases.
How do you see property graphs as distinct from RDF?
People build terrible apps without schema or validation and leave others to clean that up.
> How do you see property graphs as distinct from RDF?
This is the full answer: https://stackoverflow.com/a/30167732/2920686
I added an answer in context to the comments on the answer you've linked but didn't add a link from the comments to the answer. Here's that answer:
> (in reply to the comments on this answer: https://stackoverflow.com/a/30167732 )
> When an owl:inverseOf production rule is defined, the inverse property triple is inferred by the reasoner either when adding or updating the store, or when selecting from the store. This is a "materialized relation"
> Schema.org - an RDFS vocabulary - defines, for example, https://schema.org/isPartOf as the inverse property of hasPart. If both are specified, it's not necessary to run another graph pattern query to traverse a directed relation in the other direction. (:book1 schema:hasPart ?o), (?o schema:isPartOf :book1), (?s schema:hasPart :chapter2)
> It's certainly possible to use RDFS and OWL to describe schema for and within neo4j property graphs; but there's no reasoner to e.g. infer inverse properties or do schema validation.
> Is there any RDF graph that neo4j cannot store? RDF has datatypes and languages for objects: you'd need to reify properties where datatypes and/or languages are specified (and you'd be re-implementing well-defined semantics)
> Can every neo4j graph be represented with RDF? Yes.
> RDF is a representation for graphs for which there are very many store implementations that are optimized for various use cases like insert and query performance.
> Comparing neo4j to a particular triplestore (with reasoning support) might be a more useful comparison given that all neo4j graphs can be expressed as RDF.
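The inverse-property materialization described in that answer can be sketched without a reasoner: a single production rule applied at insert time. The mini-store below is a toy of my own (the `schema:hasPart`/`schema:isPartOf` property pair is real schema.org vocabulary; everything else is illustrative):

```python
# Materialize owl:inverseOf pairs at insert time, as a reasoner would.
INVERSES = {
    "schema:hasPart": "schema:isPartOf",
    "schema:isPartOf": "schema:hasPart",
}

class MiniStore:
    def __init__(self):
        self.triples = set()

    def add(self, s, p, o):
        self.triples.add((s, p, o))
        inv = INVERSES.get(p)
        if inv:
            # Materialize the inverse relation so a query in either
            # direction needs no extra graph-pattern traversal.
            self.triples.add((o, inv, s))

store = MiniStore()
store.add(":book1", "schema:hasPart", ":chapter2")

parts = {o for (s, p, o) in store.triples
         if s == ":book1" and p == "schema:hasPart"}
wholes = {o for (s, p, o) in store.triples
          if s == ":chapter2" and p == "schema:isPartOf"}
```

One insert yields both directed edges; that's the "materialized relation" in miniature.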
And then, some time later, I realize that I want/need to: (3) apply production rules to do inference at INSERT/UPDATE/DELETE time or SELECT time (and indicate which properties were inferred (x is a :Shape and a :Square, so x is also a :Rectangle; x is a :Rectangle and :width and :height are defined, so x has an :area)); (4) run triggers (that execute code written in a different language) when data is inserted, updated, modified, or linked to; (5) asynchronously yield streaming results to message queue subscribers who were disconnected when the cached pages were updated
A Python Interpreter Written in Python
What an excellent 500-line introduction to the byterun bytecode interpreter / virtual machine: https://github.com/nedbat/byterun
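In the spirit of that chapter, which opens with exactly this kind of toy, here's a minimal stack-machine interpreter. This is a sketch, not byterun's actual code; the opcode names are illustrative, not CPython's:

```python
def run(instructions, constants):
    """A tiny stack machine: each instruction is an (opcode, argument)
    pair, and values flow through an explicit value stack."""
    stack = []
    for op, arg in instructions:
        if op == "LOAD_CONST":
            stack.append(constants[arg])  # push a constant by index
        elif op == "ADD_TWO":
            b, a = stack.pop(), stack.pop()
            stack.append(a + b)           # pop two, push their sum
        elif op == "RETURN_VALUE":
            return stack.pop()            # top of stack is the result

# A hand-assembled program that computes 7 + 5.
program = [("LOAD_CONST", 0), ("LOAD_CONST", 1),
           ("ADD_TWO", None), ("RETURN_VALUE", None)]
result = run(program, constants=[7, 5])
```

byterun does the same thing, except the instructions come from `dis`-style real CPython bytecode and the dispatch covers the full opcode set.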
Also, proceeds from optional purchases of the AOSA books go to Amnesty International. https://aosabook.org/
Reinventing Home Directories – systemd-homed [pdf]
What a good idea.
Here's the hyperlinkified link to the {systemd-homed.service, systemd-userdbd.service, homectl, userdbctl} sources from the PDF: https://github.com/poettering/systemd/tree/homed
Hadn't heard of varlink: https://varlink.org/
Is there a FIPS-like subset of the most-widely-available LUKS configs? Otherwise home directories won't work on systems that have a limited set of LUKS modules.
Weld: Accelerating numpy, scikit and pandas as much as 100x with Rust and LLVM
There's also RustPython, a Rust implementation of CPython 3.5+: https://news.ycombinator.com/item?id=20686580
> https://github.com/RustPython/RustPython
Is this basically what Cython and PyPy are trying to do, but with Rust?
PyPy is a JIT compiler. RustPython is an interpreter.
And Cython is an AOT compiler for a superset of Python.
RustPython seems to be modestly aiming for a reimplementation of CPython.
Maybe "dialect" would be more accurate than "superset"? I don't think Cython is technically a superset of Python, since I think runtime metaprogramming features like __dict__ and monkey-patching are significantly altered or restricted?
Cython is a reification of the interpretation of a Python program. I.e., it converts the Python code into the equivalent CPython API calls (which are all in C), thereby allowing the developer to intersperse real C code. Anything you could do in Python you could technically do in Cython, although it would be much more verbose.
Yeah, I was going to make a similar comment. It's a dialect of CPython, and certainly there are extensions required to make it usable. But I'm not sure it is a strict superset of the full Python language.
Convention would lean toward calling it RPython.
Why?
Craftsmanship–The Alternative to the 4 Hour Work Week
> To be successful over the course of a career requires the application and accumulation of expertise. This assumes that for any given undertaking you either provide expertise or you are just a bystander. It’s the experts that are the drivers — an expertise that is gained from a curiosity, and a mindset of treating one’s craft very seriously.
Solar and Wind Power So Cheap They’re Outgrowing Subsidies
US$5.2 trillion was spent globally on fossil fuel subsidies in 2017
https://www.imf.org/en/Publications/WP/Issues/2019/05/02/Glo...
"hey fossil fuels, let's 1v1" sincerely, solar
(From Forbes: United States Spend Ten Times More On Fossil Fuel Subsidies Than Education)
A few things:
1) That's not what was spent, it's what this paper projected was spent.
2) I think this paper is defining subsidy in a way that we don't usually use that word. They're calling the costs associated with anthropogenic climate change and pollution "subsidies"; most people would just call those things "costs."
Using the word "subsidy" implies that governments are actively taking tax dollars and giving them to fossil fuel companies and consumers. That doesn't appear to be what's going on for the most part.
Look at Figure 4: the bulk of the so-called "subsidies" are "global warming" and "local pollution", and those aren't what most people would call subsidies; they're costs.
I'm saying this just to clarify things, personally I am supportive of massive tax increases on carbon and massive subsidies for renewables.
It’s an unconventional way to use “subsidy,” but I think it fits.
Imagine a garbage company. The government pays them so they can buy land where they dump their garbage. Obviously a subsidy.
Now, let’s say the government buys the land themselves then gives it to the company for dumping. No money changes hands but this is still pretty clearly a subsidy.
Instead of giving the company land, the government retains ownership, but lets the company dump there for free. Still a pretty clear subsidy.
Instead of buying the land, the government just takes it. Now no money is involved at all, but it’s still a subsidy.
Instead of taking the land, the government just declares that it’s legal for the garbage company to dump trash on other people’s land, and the owners just have to deal with it. This is quite different in the details from the original subsidy, but the overall effect is essentially the same.
Polluters are in a situation that’s exactly like this last scenario. They get to dump their trash on everyone’s property and don’t have to pay for the privilege. They’re being subsidized in an amount equal to whatever payment it would take to get everyone to willingly accept this trash.
This is wrong because subsidies aren't just externalized costs. They are defined as monetary gifts. You're extending the definition by analogy and arguing that a company externalizing its costs is the same thing as a subsidy, but a subsidy implies direct and deliberate support.
This is important for a couple of reasons: The main one is that it's confusing to redefine subsidy when you can just say "cost" and have people understand you. People reading this survey might see the $5.2T number and assume that that cost is in addition to whatever the cost of climate change is and will have to read the paper to understand otherwise. This is unnecessarily confusing even if one were to grant the logic of it.
In addition, when people discuss subsidies, they are often most interested in government policy. The purpose of a subsidy often is to increase a certain kind of business, so we might worry about unnecessarily funding the fossil fuels and thus encouraging climate change in a manner above and beyond simply allowing them to be used the way we've always done, but that's not quite what's going on here.
One thing to keep in mind is that it's a lot easier to measure a genuine government subsidy than an externality. So the distinction matters in that regard as well. Any measure of the cost of climate change is to at least some degree speculation, whereas any attempt to measure the direct amount of money given to fossil fuel companies can probably be much more exact.
Gifts in kind aren’t subsidies? If the government gives equipment or resources or labor, rather than money, it’s no longer a subsidy? That sure doesn’t fit how I understand the term. What would you call those?
"Gifts in kind aren’t subsidies?"
They can be but that's less common. The key difference is that a subsidy is direct and an active policy.
For example, when people talk about subsidies for renewables, they aren't talking about any externalized costs of manufacture, which do exist, they are talking about direct government gifts and tax breaks deliberately put in place to encourage investment in renewables. When people talk about subsidies given to fossil fuels the same is true, especially when they are being compared to renewables as is the case in this discussion.
Edit: removed word "decision" to clarify my meaning
Then it's a gift in kind and a subsidy.
When I litter, I get fined, but those companies are not fined when they litter, e.g., carbon all over. When I dump chemicals into nature, I get fined for polluting the environment or even imprisoned outright; those companies are not, e.g., when it's in the form of "emissions". Cars in my country get taxed directly or indirectly (through fuel) based in part on emissions, while a lot of commercial vehicles and fuels for these vehicles get a reduced rate or are even excluded from taxation. But cars here are taxed far less than in other European countries. Etc.
The governments are clearly aware that pollution and dumping your garbage are things you should not do or at least minimize. They made laws against it, but actively decided to exclude certain business sectors and/or certain types of pollution, or actively decided not to regulate or tax certain types of pollution while regulating/taxing others.
"Cars in my country get taxed directly or indirectly (through fuel) based in part on emissions"
Unless these emission taxes are calibrated to the cost of climate change, this argument is missing the point. Taxes and subsidies are often instituted in response to negative and positive externalities, but that doesn't change the fact that they are different things. This is important when trying to draw policy comparisons, which is our situation here.
You seem to want to argue that subsidizing a business and not taxing them on an externality is somehow morally the same thing, and that's an entirely different discussion, but it doesn't mean that they are factually the same thing. There is a practical difference in terms of how things are measured and how policies are compared across industries and governments and that difference matters.
>Unless these emission taxes are calibrated to the cost of climate change, this argument is missing the point.
The stated policy goal, in part, is to reduce emissions to meet climate targets to fight climate change, so yes.
>You seem to want to argue that subsidizing a business and not taxing them on an externality is somehow morally the same thing
It is.
>but it doesn't mean that they are factually the same thing.
First of all, the meaning of words and political concepts are never factual.
But I'd still argue that at a high level they are the same, and both are subsidies. In both cases the government refuses money it would otherwise collect from different parties, thereby gifting those entities value you can put a price tag on.
Those decisions are active decisions NOT to do something (while doing something about the same thing or very similar things when it comes to other parts of the population), at least at this point.
The only distinction I'd make is between direct subsidies (the government forks over money) and indirect/implicit subsidies (the government decides not to make certain entities pay for certain things for which other entities have to pay the government).
">but it doesn't mean that they are factually the same thing.
First of all, the meaning of words and political concepts are never factual."
You're still missing the point. The point is that there is a distinction between a subsidy and an externality, and that distinction is important. It's important for measurement reasons (the exact monetary amount a government spends on something is easier to measure than the indirect cost of a policy) and for simple communication reasons. It makes no sense to talk about instituting a tax to cover a subsidy; you institute a tax to cover an externality. It also matters because there are ways of dealing with externalities other than taxes and subsidies, and reducing the language makes this more confusing. It's especially confusing when the distinction is made in one discussion (about renewables) but not in the other (about fossil fuels).
There is a term for using an unexpected definition for a word that already has a widely used definition during a discussion: a 'stipulative definition'. It's dishonest to do so without being clear upfront, or in response to a discussion where the original definition is in use; this results in equivocation. Whether or not a subsidy is morally equivalent to an externality is a moot point if you're willing to toss about with the language. My work involves financial reporting, and if my employer asked for one set of numbers and I gave him another that I argued was 'morally equivalent', I would pretty clearly be in the wrong, even if the point I was making about moral equivalence was correct.
No, you're splitting hairs.
There are direct and indirect subsidies. Indirect subsidies include externalities: external costs paid by everyone else (that the government should be incentivizing reductions in by requiring the folks causing them to pay)
Semantic digressions aside, they're earning while everyone else pays costs resultant from their operations (and from our apparent inability to allocate with e.g. long term security, health, and prosperity as primary objectives for the public sphere)
Handing money to polluters helps polluters, and failing to disincentivize externalities also helps polluters, but it's OK to call one thing a subsidy and the other thing poor governance.
"Subsidy" carries a connotation of purposeful action to help something. Subsidizing a bad thing is worse than merely allowing it to happen or looking the other way. It seems like you want to re-label things in "B" by the label for "A" to make it sound worse.
>"Subsidy" carries a connotation of purposeful action to help something.
Purposeful action implies awareness and intention.
Externality implies not yet recognized.
Having 2 words for similar concepts does not mean they are different concepts. "heavy" and "massive" are two different words but they convey essentially the same thing.
I would agree that unrecognized externalities are not subsidies (say, CO2 pollution before it was recognized), but only because they were literally not recognized as externalities (yet).
But the very moment CO2 pollution is recognized, the previously unrecognized externality is to be instantly viewed as a subsidy.
History repeats itself, the first time as a tragedy, from then on as a farce.
[deleted]
Why do subsidies have to be an active policy decision?
You’re right: when people talk about fossil fuel subsidies, they’re thinking of government gifts and tax breaks. My point is that this is a deeply inadequate way to think about it. If you only look at these, and compare fossil fuels to renewables, you’ll come away thinking that fossil fuels are far cheaper than they actually are.
"If you only look at these, and compare fossil fuels to renewables, you’ll come away thinking that fossil fuels are far cheaper than they actually are."
In the case of this discussion, you're looking at it backwards. The subject of fossil fuel subsidies was brought up in comparison to subsidies for renewables which in this case were already defined as specific government programs. If it weren't for that, I wouldn't be being such a stickler. However given that we are comparing renewables to fossil fuels we need to be sure that we are measuring the same thing and to include externalities when measuring fossil fuels and not when measuring renewables is to be not measuring the same thing. (I actually got confused when I first saw the IMF link so I know my concern isn't hypothetical.)
Yes, I get that the case can be made that the cost of using fossil fuels is more than the sum of direct gifts by the government to fossil fuel companies (fucking duh!), but that can be expressed without implying that governments gave $5.2T in tax dollars directly to fossil fuel companies in 2017. Where confusion is possible, it's best to make distinctions and clarify what is meant.
"subsidies" includes both direct and indirect subsidies.
We can measure direct subsidies by measuring real and effective tax rates.
We can measure indirect subsidies like healthcare costs paid by Medicare with subjective valuations of human life and rough estimates of the value of a person's health and contribution to growth in GDP, and future economic security.
But who has the time for this when we're busy paying to help folks who require disaster relief services from the government and NGOs (neither of which are preventing further escalations in costs)
Problem is that if the word "subsidies" is used exclusively to mean a specific and narrow concept around cash transfers when talking about renewables, but then you use the term to include a much broader set of scenarios when talking about fossil fuels, while directly comparing the derived numbers, then you're misusing the language in order to deceive, regardless of all else. You're communicating that the numbers represent the same calculations for both forms of energy, when you know that they actually don't.
Define the terms however you like, as long as you are careful not to induce people to believe something that is untrue.
Show HN: Python Tests That Write Themselves
This actually gave me another idea. What do you all think, I’d be up for trying to build it.
It would watch your program during execution and record each function call's inputs and outputs, and then create tests for each function using those inputs and outputs.
You could always go over the created tests manually and fix them or review them but if nothing else it could be a good start.
Instagram created 'MonkeyType' to do something similar for type annotations.
A lower barrier way of trying this out could be to run MonkeyType in combination with OP's hypothesis-auto (which uses type annotations to generate the tests).
pytype (Google) [1], PyAnnotate (Dropbox) [2], and MonkeyType (Instagram) [3] all do dynamic / runtime PEP 484 type-annotation inference
[1] https://github.com/google/pytype
[2] https://github.com/dropbox/pyannotate
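As a toy illustration of the runtime-tracing mechanism these tools share (this is a sketch, not their actual API):

```python
import functools

observed = {}

def trace_types(fn):
    """Record the concrete argument/return types seen at call time."""
    @functools.wraps(fn)
    def wrapper(*args, **kwargs):
        result = fn(*args, **kwargs)
        sig = (tuple(type(a).__name__ for a in args), type(result).__name__)
        observed.setdefault(fn.__name__, set()).add(sig)
        return result
    return wrapper

@trace_types
def double(x):
    return x * 2

double(3)
double("ab")
print(observed["double"])  # the (arg types, return type) pairs seen so far
```

The real tools then turn these observations into stub files or in-place annotations; the hard part they solve is merging many observations into a single sensible type.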
Most Americans see catastrophic weather events worsening
The stratifications on this are troubling.
> But there are wide differences in assessments by partisanship. Nine in 10 Democrats think weather disasters are more extreme, compared with about half of Republicans.
It's not a partisan issue: we all pay these costs.
> Majorities of adults across demographic groups think weather disasters are getting more severe, according to the poll. College-educated Americans are slightly more likely than those without a degree to say so, 79 percent versus 69 percent.
Weather disasters are getting more severe. It is objectively, quantitatively true that weather disasters are getting more frequent and more severe.
> Weather disasters are getting more severe. It is objectively, quantitatively true that weather disasters are getting more frequent and more severe.
Source? What definitions are being used for severity? How is the sample of events selected? Is there a statistically-significant effect or might it be random variation?
> Source? What definitions are being used for severity? How is the sample of events selected? Is there a statistically-significant effect or might it be random variation?
These are great questions that any good skeptic / data scientist should always be asking. Here are some summary opinions based upon meta analyses with varyingly stringent inclusion criteria.
( I had hoped that the other top-level post I posted here would develop into a discussion, but these excerpts seem to have bubbled up. https://news.ycombinator.com/item?id=20919368 )
"Scientific consensus on climate change" lists concurring, non-committal, and opposing groups of persons with and without conflicting interests: https://en.wikipedia.org/wiki/Scientific_consensus_on_climat...
USGCRP, "2017: Climate Science Special Report: Fourth National Climate Assessment, Volume I" [Wuebbles, D.J., D.W. Fahey, K.A. Hibbard, D.J. Dokken, B.C. Stewart, and T.K. Maycock (eds.)]. U.S. Global Change Research Program, Washington, DC, USA, 470 pp, doi: 10.7930/J0J964J6.
"Chapter 8: Droughts, Floods, and Wildfire" https://science2017.globalchange.gov/chapter/8/
"Chapter 9: Extreme Storms" https://science2017.globalchange.gov/chapter/9/
"Appendix A: Observational Datasets Used in Climate Studies" https://science2017.globalchange.gov/chapter/appendix-a/
The key findings in this report do list supporting evidence and degrees of confidence in predictions about the frequency and severity of severe weather events.
I'll now support the challenged claim that disaster severity and frequency are increasing by citing disaster-relief cost charts, which admittedly only support it indirectly. Unlike your typical televised debate or congressional session, I have visual aids, a computer, and links to the sources I've referenced. Finding the underlying datasets ( https://schema.org/Dataset ) for these charts is something someone may have time for; meanwhile, the costs to taxpayers and insurance holders are certainly increasing, for a number of reasons.
"Taxpayer spending on U.S. disaster fund explodes amid climate change, population trends" (2019) has a nice chart displaying "Disaster-relief appropriations, 10-year rolling median" https://www.washingtonpost.com/us-policy/2019/04/22/taxpayer...
"2018's Billion Dollar Disasters in Context" includes a chart from NOAA: "Billion-Dollar Disaster Event Types by Year (CPI-Adjusted)" with the title embedded in the image text - which I searched for - and eventually found the source of: [1] https://www.climate.gov/news-features/blogs/beyond-data/2018...
[1] "Billion-Dollar Weather and Climate Disasters: Time Series" (1980-2019) https://www.ncdc.noaa.gov/billions/time-series
Thank you!
It seems these data are mostly counted in dollars of damage or dollars of relief, which is a proxy for the severity.
Would it be correct to say there is still some question about whether dollars are a good measure of severity?
EDIT: As I am browsing the data, it's hard to disentangle actual weather events from things like lava, fires, unsound building decisions, and just the politics of money moving around.
> It is objectively, quantitatively true that weather disasters are getting more frequent and more severe.
From the article, storms may be getting slightly more severe - but "more frequent" is incorrect:
> Scientific studies indicate a warming world has slightly stronger hurricanes, but they don’t show an increase in the number of storms hitting land, Colorado State University hurricane researcher Phil Klotzbach said. He said the real climate change effect causing more damage is storm surge from rising seas, wetter storms dumping more rain and more people living in vulnerable areas.
Now, we all should do what we can to address it - but we all should also examine whether we're picking and choosing only scientific observations that support our feelings/pre-conceived notions while discarding the rest of the data.
The article seems to have focused on the perceptions of persons who aren't concerned with taking an evidence-based look (at various types of storms: floods, cyclones (i.e. hurricanes), severe thunderstorms, windstorms). Regardless, costs are increasing. I've listed a few sources here: https://news.ycombinator.com/item?id=20925127
"2017: Climate Science Special Report: Fourth National Climate Assessment, Volume I" > "Chapter 9: Extreme Storms" lists a number of relevant Key Findings with supporting evidence (citations) and degrees of confidence: https://science2017.globalchange.gov/chapter/9/
> we all pay these costs
... if there actually are costs to pay. You seem to be sidestepping the actual question here.
"Billion-Dollar Weather and Climate Disasters: Time Series" (1980-2019) https://www.ncdc.noaa.gov/billions/time-series
> The stratifications on this are troubling.
They are, yet almost no one I know of (in leadership positions, or the general public) seems to think they're troubling enough to put any serious effort into determining, with some level of certainty, why there is so much stratification along political lines on so many important issues, this being only one of them. "Conservatives are uneducated" sounds about right, so it seems that's what we go with. That belief may be right, but it may not.
> It's not a partisan issue: we all pay these costs.
From one perspective, but as is usually the case, there are several perspectives involved. Another non-trivial one is that the solution requires non-partisan cooperation, and from that perspective partisanship is not only an issue, it is a crucially important one, so we might be well served by putting some effort into understanding its true nature.
> Weather disasters are getting more severe. It is objectively, quantitatively true that weather disasters are getting more frequent and more severe.
For certain definitions of "more", "severe", "objectively", "true", and "frequent".
So if the problem isn't simply that "conservatives are uneducated", what else might it be? I think the answer lies in the incredibly complex manner in which people perceive the world in general, what information they consume, how they perceive that information, and how they integrate it into their personal internal overall worldview. People think they think in facts, but they actually think in stories, and in turn this affects how they consume new information.
Take this simple article as an example, and notice the variety of perspectives we see already with just a few (14 at the time of my writing) comments posted in this thread. Notice how people aren't discussing just the content of the article, but rather including related ideas from past information they have consumed and stored (the mechanics of which we do not understand) within the mental model of the world they hold in their brain. An even better example of this behavior can be seen in the recent discussion on the CDC report on vaping: https://news.ycombinator.com/item?id=20915520 Observe all the complex perspectives based on the very same "facts" in the article. Observe all the assertions of related "facts", that are actually only opinions. Observe all the mind reading.
If you start looking for this phenomenon you will notice it everywhere, and if you pay closer attention you may also notice that certain topics are particularly prone to devolving into narrative (rather than fact) based conversation. The usual suspects are obvious: religion, gender, sexuality, etc, but smoking/vaping seems to me like somewhat of an interesting outlier. The former examples are very closely tied to personal identity, but while smoking/vaping shares an identity attribute to some degree, I feel like there's some other unseen psychological issue in play that results in such high polarization of beliefs.
My guess is you and I see very different things despite consuming the very same article. I lean conservative/libertarian (generally speaking), and I am deeply distrustful of government (for extremely good reasons I believe), so I know for a fact that my interpretation of the article is going to be heavily distorted by that. Any logical inconsistency, ambiguousness, disingenuousness, technical dishonesty, or anything else along those lines is going to get red flagged in my mind, whereas others will read it in a much more forgiving fashion. And in an article on a different political hot topic, we will switch our behaviors.
In such threads, I think it would be extremely interesting for people with opposing views to post excerpts of the parts that "catch your attention", with an explanation of why. This is kind of what happens anyway, but I'm thinking with a completely different motive: rather than quoting excerpts with commentary to argue your ~political side of the issue with the goal of "winning the argument", take an unemotional, more abstract view of your personal cognitive processing of the article, and post commentary on ~why/how you believe you feel you consider that important on a psychological level. Psychological self-analysis is famously difficult, but even with moderate success I suspect some very interesting things would rise to the surface.
> My guess is you and I see very different things despite consuming the very same article. I lean conservative/libertarian (generally speaking),
HN specifically avoids politics. In the context of the article, when you say "conservative/libertarian", do you mean: fiscally conservative (I haven't seen a deficit hawk in decades, other than "Read my lips. No new taxes" followed by responsibly raising taxes), socially libertarian (liberty as a fundamental right; if you're not violating the rights of others, the government is not obligated, or even granted the right, to intervene at all), or conservative as in imposing your particular traditional standard of moral values, which you believe are particular to a particular side of the aisle?
Or, do you mean that you're libertarian in regards to the need and the right to regulate business and industry in the interest of consumers ("laissez faire")? I'm certainly not the only person to observe that lack of regulation results in smog-filled cities due to un-costed 'externalities' in a blind pursuit of optimization for short-term profit.
At issue here, I think, is whether we think we can avert future escalations of costs by banding together to address climate change now; and how best to achieve the Paris Agreement targets that we set for ourselves (despite partisan denial, delusion, and indifference to increasing YoY costs [1]) https://en.wikipedia.org/wiki/Paris_Agreement
I'm personally and financially far more concerned about the long-term costs of climate change than a limited number of special interests who can very easily diversify and/or divest to take advantage of the exact same opportunities.
> and I am deeply distrustful of government (for extremely good reasons I believe), so I know for a fact that my interpretation of the article is going to be heavily distorted by that. Any logical inconsistency, ambiguousness, disingenuousness, technical dishonesty, or anything else along those lines is going to get red flagged in my mind, whereas others will read it in a much more forgiving fashion. And in an article on a different political hot topic, we will switch our behaviors.
While governments (and militaries (TODO)) do contribute substantially to emissions and resultant climate change, I think it unnecessary to qualify that unregulated decisions by industry should be the primary focus here. Industry has done far more to cause climate change than governments (which can more efficiently provide certain services useful to all citizens)
> In such threads, I think it would be extremely interesting for people with opposing views to post excerpts of the parts that "catch your attention", with an explanation of why. This is kind of what happens anyway, but I'm thinking with a completely different motive: rather than quoting excerpts with commentary to argue your ~political side of the issue with the goal of "winning the argument", take an unemotional, more abstract view of your personal cognitive processing of the article,
These people aren't doing jack about the problem because they haven't reviewed this chart: "Billion-Dollar Weather and Climate Disasters: Time Series" (1980-2019) https://www.ncdc.noaa.gov/billions/time-series
Maybe they want insurance payouts, which result in higher premiums. Maybe the people who built in those locations should be paying the costs.
> and post commentary on ~why/how you believe you feel you consider that important on a psychological level. Psychological self-analysis is famously difficult, but even with moderate success I suspect some very interesting things would rise to the surface.
They don't even care because they refuse to accept that it's a problem.
The article was ineffectual at addressing the very real problem.
From https://news.ycombinator.com/item?id=20925127 :
> ( I had hoped that the other top-level post I posted here would develop into a discussion, but these excerpts seem to have bubbled up. https://news.ycombinator.com/item?id=20919368 )
In this observational study of perceptions, college education was less predictive than party affiliation.
Maybe reframing this as a short-term money problem [1] would result in compassion for people who are suffering billions of dollars of loss every year.
> HN specifically avoids politics.
I'm not saying this to be argumentative, but I suspect that is once again merely your perception. HN avoids (dang aggressively shuts down, to the detriment of the world imho, because if smart people can't find a way to discuss these things objectively, how do we expect the average person on the street to) political discussions that have a caustic odor, but there's plenty of political shit talking on HN.
> "conservative/libertarian" do you mean: fiscally conservative
Yes, but the real kind, not the phonies we've had for decades.
> socially libertarian (Liberty as a fundamental right; if you're not violating the rights of others the government is not obligated or even granted the right to intervene at all)
Yes.
> or conservative as in imposing your particular traditional standard of moral values which you believe are particular to a particular side of the aisle
To a degree, but here I think you're dealing with an interpretation of reality more than reality itself (not meant as an insult; it's how humans are). But yes, somewhat, and this is a whole other interesting and important conversation, that I think society should be having in a more serious / less polarized way.
> Or, do you mean that you're libertarian in regards to the need and the right to regulate business and industry in the interest of consumers ("laissez faire")?
I used to be very laissez faire, but in many specific areas I am now the exact opposite - if I was in charge, corporations would be in for a very rude awakening. But laissez faire is my default stance until facts suggest it is excessively harmful to the greater good.
> At issue here, I think, is whether we think we can avert future escalations of costs by banding together to address climate change now
100% agree. But if we willfully ignore the realities of human nature/psychology, I predict we'll never be able to even remotely band together, especially in the hyper-weaponized-meme world we now find ourselves in. Averting future escalations is the larger goal, but it is completely dependent upon cooperation, which is dependent on communication & perception. I believe perception is where we are failing most.
> I'm personally and financially far more concerned about the long-term costs of climate change than a limited number of special interests who can very easily diversify and/or divest to take advantage of the exact same opportunities.
Me too.
>> and I am deeply distrustful of government (for extremely good reasons I believe), so I know for a fact that my interpretation of the article is going to be heavily distorted by that. Any logical inconsistency, ambiguousness, disingenuousness, technical dishonesty, or anything else along those lines is going to get red flagged in my mind, whereas others will read it in a much more forgiving fashion. And in an article on a different political hot topic, we will switch our behaviors.
> While governments (and militaries (TODO)) do contribute substantially to emissions and resultant climate change, I think it unnecessary to qualify that unregulated decisions by industry should be the primary focus here. Industry has done far more to cause climate change than governments (which can more efficiently provide certain services useful to all citizens)
I think you misunderstood. Here I'm not talking about government/military damage to the environment (although that's a very big deal, the hypocrisy of which reinforces my distrust even more); I'm saying that I don't trust what they're up to at all. With a few exceptions (Bernie Sanders, etc.), I am very distrustful of the true honesty and sincerity of all politicians, regardless of affiliation. I suppose it's less that they're fundamentally dishonest and more that the nature of our system is such that you have to be dishonest.
>> In such threads, I think it would be extremely interesting for people with opposing views to post excerpts of the parts that "catch your attention", with an explanation of why. This is kind of what happens anyway, but I'm thinking with a completely different motive: rather than quoting excerpts with commentary to argue your ~political side of the issue with the goal of "winning the argument", take an unemotional, more abstract view of your personal cognitive processing of the article,
> These people aren't doing jack about the problem because they haven't reviewed this chart: "Billion-Dollar Weather and Climate Disasters: Time Series" (1980-2019) https://www.ncdc.noaa.gov/billions/time-series Maybe they want insurance payouts, which result in higher premiums. Maybe the people who built in those locations should be paying the costs.
I'm afraid you've completely misunderstood. I was referring to a meta discussion, on human psychology and the nature of perception - the distinctly different way in which you and I consume, perceive, store, integrate, and recall the information in an article, not the information itself. I believe this is where the solution to these problems is hiding.
> They don't even care because they refuse to accept that it's a problem.
Here you are acting as if you are able to read the minds of other people. Intuition, which is kind of a dimensional compression of perceived reality, has evolved in humans because it is extremely useful. However, when dealing with high complexity, it can be dangerous.
> ( I had hoped that the other top-level post I posted here would develop into a discussion, but these excerpts seem to have bubbled up. https://news.ycombinator.com/item?id=20919368 )
Oh, you're not wrong on those facts I'm sure, but people don't think in facts. They think they do, but they don't actually. If we want to do something about these and other problems, you have to look for the invisible blockages, and some of them are within you and I, despite the sincerity of our intentions.
Unfortunately, even genuinely smart people (like much of the HN crowd) seem to be extremely resistant to even considering this notion. I suspect the problem is that although a person may be highly educated, at the subconscious level they are still highly tribal. But I believe if we can get people to start to realize these things, to see the world in this abstract form, then much of the rest might fall into place far easier than the current apparent polarization of beliefs would suggest.
> It is objectively, quantitatively true that weather disasters are getting more frequent and more severe.
There is no data to support such an assertion. It's confirmation bias stemming from the belief that climate change will lead to such an effect.
> "2017: Climate Science Special Report: Fourth National Climate Assessment, Volume I" > "Chapter 9: Extreme Storms" lists a number of relevant Key Findings with supporting evidence (citations) and degrees of confidence: https://science2017.globalchange.gov/chapter/9/
How about a link to a chart indicating frequency and severity of severe weather events?
The Paris Agreement is predicated upon the link between human actions, climate change, and severe weather events. 195 countries have signed the Paris Agreement with consensus that what we're doing is causing climate change.
Here are some climate-relevant poll questions:
Do you think the costs of disaster relief will continue to increase due to frequency and severity of severe weather events?
Does it make sense to spend more on avoiding further climate change now rather than even more on disaster relief later?
How can you help climate refugees? Do you donate to DoD and National Guards? Do you donate to NGOs? How can we get better at handling more frequent and more severe disasters?
Emergent Tool Use from Multi-Agent Interaction
Inkscape 1.0 Beta 1
Where Dollar Bills Come From
Monetary Policy Is the Root Cause of the Millennials’ Struggle
Volatility works out for people who save (who park capital in liquid assets that aren't doing work in order to have wheat for the eventual famine). These guys. They save, short like heck when the market is falling, and swoop in to save the day. What a great time to be selling 0% loans.
Personal Savings Rate (PSR) stratified by greatest generation and not greatest generation is also relevant. Are relatively fixed living expenses higher now? Yes. Is my generation just blowing what they could invest into interest-bearing investments on unnecessary stuff from Amazon? Yes. And expensive meals and drinks.
How have corporate profits and wages changed?
In their day, you put your gosh-danged money aside. For later. So that you have money later.
And that is why you should buy my book, entitled: "Invest in things with long term returns: don't buy shtuff you don't f need, save for tomorrow; and other financial advice"
Which brings me to: the cost of college textbooks and a college education in terms of average hourly wages.
By the way, over the longer term, index funds are likely to outperform actively managed funds. Gold may be likely to outperform the stock market. And, over the recent term -- this is for all you suckers out there -- cryptocurrencies have outperformed all stock and commodities markets. How much total wealth is being created on an annual basis here?
Payday loans have something like 300% APY.
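For concreteness, here's how such a rate annualizes, assuming a hypothetical (but typical) $15 fee per $100 borrowed over a 14-day term; the simple-annualized APR and the compounded APY differ wildly:

```python
# Hypothetical payday loan: $15 fee per $100 borrowed, 14-day term.
fee_rate = 0.15
periods_per_year = 365 / 14

apr = fee_rate * periods_per_year              # simple annualization
apy = (1 + fee_rate) ** periods_per_year - 1   # compounded annualization

print(f"APR: {apr:.0%}")  # roughly 391%
print(f"APY: {apy:.0%}")  # far higher, if the fee were rolled over every term
```

Quoted payday-loan rates are usually the simple APR figure; the compounded number only applies if the borrower keeps rolling the loan over, which is exactly the debt-trap scenario.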
How does 2% inflation affect trade when other central banking cabals haven't chosen the same target? "Devaluation"! "Treachery"!
Non-root containers, Kubernetes CVE-2019-11245 and why you should care
> At the same time, all the current implementations of rootless containers rely on user namespaces at their core. Not to be confused with what is referred to as non-root containers in this article, rootless containers are containers that can be run and managed by unprivileged users on the host. While Docker and other runtimes require a daemon running as root, rootless containers can be run by any user without additional capabilities.
non-root / rootless
How do black holes destroy information and why is that a problem?
Man, watch the PBS YouTube channel SpaceTime. The episodes are short, informative, and very well done. They dedicate a handful of episodes to getting you ready for Hawking radiation.
"Why Quantum Information is Never Destroyed" re: determinism and T-Symmetry ("time-reversal symmetry") by PBS SpaceTime https://youtu.be/HF-9Dy6iB_4
Classical information is 'collapsed' quantum information, so that would mean that classical information is never lost either.
There appear to be multiple solutions to the Navier-Stokes equations; i.e., the behavior is somewhat chaotic.
If white holes are on the other side of black holes, Hawking radiation would not account for the entirety of the collected energy/information. Is our visible universe within a white hole? Is everything that's ever been embedded in the sidewall of a black hole shredder?
Maybe even recordings of dinosaurs walking; or is that lemurs walking in reverse?
Do 1/n, 1/∞, and n/∞ approach a symbolic limit where scalars should not be discarded; with piecewise operators?
Banned C standard library functions in Git source code
FWIW, here's awesome-static-analysis > Programming Languages > C/C++: https://github.com/mre/awesome-static-analysis/blob/master/R...
These tools have lists of functions not to use. Most of them — at least the security-focused ones — likely also include strcpy, strcat, strncpy, strncat, sprintf, and vsprintf, just like banned.h
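As a toy illustration of the kind of check these linters (and banned.h) encode, here is a naive Python scan of C source for those calls (the function list matches banned.h; a real tool would parse the code rather than pattern-match on text):

```python
import re

# Functions banned in Git's banned.h for buffer-overflow risk.
BANNED = {"strcpy", "strcat", "strncpy", "strncat", "sprintf", "vsprintf"}

def find_banned_calls(c_source):
    """Return (function, line_number) pairs for banned-looking calls."""
    hits = []
    for lineno, line in enumerate(c_source.splitlines(), start=1):
        for fn in sorted(BANNED):
            # \b keeps e.g. snprintf from matching the sprintf pattern.
            if re.search(rf"\b{fn}\s*\(", line):
                hits.append((fn, lineno))
    return hits

src = 'int main(void) {\n    char buf[8];\n    strcpy(buf, "hi");\n    return 0;\n}'
print(find_banned_calls(src))  # → [('strcpy', 3)]
```

banned.h itself works differently: it redefines each function via the preprocessor so that any use fails at compile time, which is more robust than a textual scan.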
Ask HN: What's the hardest thing to secure in a web-app?
"OWASP Top 10 Most Critical Web Application Security Risks" https://www.owasp.org/index.php/Category:OWASP_Top_Ten_Proje...
> A1:2017-Injection, A2:2017-Broken Authentication, A3:2017-Sensitive Data Exposure, A4:2017-XML External Entities (XXE), A5:2017-Broken Access Control, A6:2017-Security Misconfiguration, A7:2017-Cross-Site Scripting (XSS), A8:2017-Insecure Deserialization, A9:2017-Using Components with Known Vulnerabilities, A10:2017-Insufficient Logging&Monitoring
"OWASP Top 10 compared to SANS CWE 25" https://www.templarbit.com/blog/2018/02/08/owasp-top-10-vs-s...
Crystal growers who sparked a revolution in graphene electronics
> This seven-metre-tall machine can squeeze carbon into diamonds
OT but, is this a thing now? Diamonds can be entangled.
Don't know what you mean by entangled. They mention forming diamonds using the heat and pressure of the press, which is a well known technique to alter the crystal lattice between carbon atoms.
Does it take more energy than mining for diamonds?
> Quantum Entanglement Links 2 Diamonds: Usually a finicky phenomenon limited to tiny, ultracold objects, entanglement has now been achieved for macroscopic diamonds at room temperature (2011) https://www.scientificamerican.com/article/room-temperature-...
It takes less human suffering.
Things to Know About GNU Readline
I map <up> to history-search-backward in my .inputrc; so I can type 'sudo ' and press <up> to cycle through everything starting with sudo:
# <up> -- history search backward (match current input)
"\e[A": history-search-backward
# <down> -- history search forward (match current input)
"\e[B": history-search-forward
https://github.com/westurner/dotfiles/blob/develop/etc/.inpu...
I do the exact same thing. This will probably be considered sacrilege by some, but I also remap tab completion to cycle through all matches instead of stopping at the first ambiguous character:
# Set up tab and Shift-tab to cycle through completion options
"\e[Z": "\e-1\t"
TAB: menu-complete
# Alt-S is normal style TAB complete
"\es": complete
> I also remap tab completion to cycle through all matches instead of stopping at the first ambiguous character
Windows PowerShell does this and I love it. I don’t understand why people wouldn’t want this
Because instead of helping me enter a long filename by its distinctive parts, it forces flat mental read-decide-skip loop based on first few letters, which is slow and cannot be narrowed down without typing out additional letters by hand. It is like selecting from a popup menu through a 1-line window.
If it presented a real popup for file selection, combining both worlds (tab to longest match, untab to undo, up-down to navigate), that would be great. But it doesn’t.
> which is slow and cannot be narrowed down without typing out additional letters by hand.
In a PowerShell environment specifically, you usually end up having to backspace a dozen or more characters completed from a long command name before you can resume narrowing the search. It's less of a problem on a Unix shell where commands tend to be short.
Interestingly, that's one of the things I hate about powershell (and dos)
zsh works like that too.
I did this a lot, too, until I started using fzf and mapped <Control-r> to fuzzy history search. It is really useful, you might like it!
I came here to post exactly this! Very useful.
Two more quick tips
1. Undo is ^_ (ctrl+shift+-)
2. To learn more tricks...
bindkey -p
> Undo is ^_ (ctrl+shift+-)
I use ctrl-/. I don't recall whether it's a default binding or not, but at least you don't have to press the shift key :-)
Both are valid undo bindings in emacs, so that is probably the reason they are both supported. (I think this is an artifact of a time when the keys were indistinguishable; and I think maybe they still are mostly indistinguishable to terminal apps.)
Is this macro from the article dangerous because it doesn't quote the argument?
Control-j: "\C-a$(\C-e)"
I can never remember how expansion and variable substitution work in shells.
The macro means:
\C-a: beginning of line
$: self-insert
(: self-insert
\C-e: end of line
): self-insert
So if your prompt (with | for cursor) looks like this: grep 'blah blah' |some_file
And you execute that macro, you get: $(grep 'blah blah' some_file)|
Which is correctly quoted, and I think it always correctly quotes whatever is at the prompt (unless you e.g. get halfway through a string and press enter so the beginning of the prompt is halfway through a string, or maybe if you have multiple lines)
Show HN: Termpage – Build a webpage that behaves like a terminal
This looks useful.
FWIW, you can build a curses-style terminal GUI with Urwid (in Python) and use that through the web. AFAIU, it requires Apache; but it's built on Tornado (which is now built on Asyncio) so something more lightweight than Apache on a Pi should definitely be doable. Termpage with like a Go or Rust REST API may still be more lightweight, but more work.
Vimer - Avoid multiple instances of GVim with gvim –remote[-tab]-silent wrapper
I have a shell script I named 'e' (for edit) that does basically this. If VIRTUAL_ENV_NAME is set (by virtualenvwrapper), e opens a new tab in that gui vim remote if gvim or macvim are on PATH, or just in a console vim if not. https://github.com/westurner/dotfiles/blob/develop/scripts/e
'editwrd'/'ewrd'/'ew' does tab-completion relative to whatever $_WRD (working directory) is set to (e.g. by venv) and calls 'e' with that full path: https://github.com/westurner/dotfiles/blob/develop/scripts/_...
It's unfortunately not platform portable like vimer, though.
Electric Dump Truck Produces More Energy Than It Uses
Ask HN: Let's make an open source/free SaaS platform to tackle school forms
I have 4 kids. I am filling out all the start of school forms for each kid. I have to fill out these same forms each year. Are you doing the same thing? Let's make this year the last year we are manually filling out forms -- let's build a SaaS platform for school forms. Community built, open-sourced, free.
Brief sketch of the idea: SurveyMonkey + DocuSign, but with 100 pre-built templates for K-12 school situations. Medical emergency form. Carpool form. Field trip permission form. Backend gives schools an easy way to customize and track forms. Forms are emailed to parents and filled out online. Parents' information is saved so that any new form is pre-filled with as much known info as possible.
Anyone feeling the same pain? Anyone want to join with me and do it?
Technically, a checkbox may qualify as a digital signature; however, identification / authentication and storage integrity are fairly challengeable (just as a written signature on a piece of paper with a date written on it is challengeable)
Given that notarization is not required for parental consent forms, I'm not sure what sort of server security expense is justified or feasible.
How much does processing all of the paper forms cost each school? Per-student?
In terms of storing digital record of authorization, a private set of per-student OpenBadges with each OpenBadge issued by the school would be easy enough. W3C Verified Claims (and Linked Data Signatures) are the latest standards for this sort of thing.
We could evaluate our current standards for chain of custody in regards to the level of trust we place in commercial e-signature platforms.
The school could send home a sheet with a QR code and a shorturl, but that would be more expensive than running hundreds of copies of the same sheet of paper.
The school could require a parent or guardian's email address for each student in the SIS Student Information System and email unique links to prefilled forms requesting authorization(s).
Just as with e-Voting, assuring that the person who checks a checkbox or tries to scribble their signature with a mouse or touchscreen is the authorized individual may be more difficult than verifying that a given written signature is that of the parent or guardian authorized to authorize.
AFAIU, Google Forms for School can include the logged-in user's username; but parents don't have school domain accounts with Google Apps for Education or Google Classroom.
How would the solution integrate with schools' existing SIS (Student Information Systems)? Upload a CSV of (student, {student info}, {guardian email (s)})? This is private information that deserves security, which costs money.
Which users can log in for the school and/or district to check the state of the permission / authorization requests and the PII (personally-identifiable information)?
While cryptographic signatures may be overkill as a substitute for permission slips, FWIW, a timestamp within a cryptographically-signed document only indicates what the local clock was set to at the time. Blockchains have relatively indisputable timestamps ("certainly no later than the time that the tx made it into a block"), but blockchains don't solve for proving the key-person relation at a given point in time.
And also, my parent or guardian said you can take me on field trips if you want. https://backpack.openbadges.org/
Ask HN: Is there a CRUD front end for databases (especially SQLite)?
I'm currently looking for a program (a simple executable) that "opens" an SQLite database and (via introspection of the schema) without any further configuration allows simple CRUD operations on the database.
Yes, there is DB Browser and a gazillion other database administration frontends, but it should really be limited to CRUD operations. No changing the table, the schema, the indexes. Simple UI.
For users that have no idea about SQL or databases.
Is there anything like that already done and ready to use?
There are lots of apps that do database introspection. Some also generate forms on the fly, but eventually it's necessary to: specify a forms widget for a particular field because SQL schema only describes the data and not the UI; and specify security authorization restrictions on who can create, read, update, or delete data.
And then you want to write arbitrary queries to filter on columns that aren't indexed; but it's really dangerous to allow clients to run arbitrary SQL queries because there basically are no row/object-level database permissions (the application must enforce row-level permissions).
Datasette is a great tool for read-only database introspection and queries of SQLite databases. https://github.com/simonw/datasette
Sandman2 generates a REST API for an arbitrary database. https://github.com/jeffknupp/sandman2
You can generate Django models and then write admin.py files for each model/table that you want to expose in the django.contrib.admin interface.
There are a number of apps for providing a GraphQL API given introspection of a database that occurs at every startup or at runtime; but that doesn't solve for row-level permissions (or web forms)
If you have an OpenAPI spec for the REST API that runs atop The database, you can generate forms ("scaffolding") from the OpenAPI spec and then customize those with form widgets; optionally with something like json-schema.
It's not safe to allow introspected CRUD like e.g. phpMyAdmin for anything but development. If there are no e.g. foreign-key constraints specified in the SQL schema, a blindly-introspected UI very easily results in database corruption due to invalid foreign key references (because the SQL schema doesn't specify what table.column a foreign key references).
Django models, for example, unify SQL schema and forms UI in models.py; admin.py is optional but really useful for scaffolding (such as when you're doing manual testing because you haven't yet written automated tests) https://docs.djangoproject.com/en/2.2/ref/contrib/admin/#mod...
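As a small illustration of what blind schema introspection can and cannot see, here is a minimal sketch using Python's stdlib sqlite3 (the field-spec dict format is made up for illustration): `PRAGMA table_info` yields column names, declared types, and NOT NULL/primary-key flags, but nothing about form widgets, row-level permissions, or foreign-key targets when constraints aren't declared.

```python
import sqlite3

def introspect_form_fields(db_path, table):
    """Derive a naive form-field spec from an SQLite table's schema.

    PRAGMA table_info returns (cid, name, type, notnull, default, pk)
    per column; that is all the schema tells us, so widget choice and
    authorization rules still have to come from somewhere else.
    """
    con = sqlite3.connect(db_path)
    try:
        fields = []
        # Table names cannot be bound as SQL parameters in a PRAGMA
        for cid, name, coltype, notnull, default, pk in con.execute(
                "PRAGMA table_info(%s)" % table):
            fields.append({
                "name": name,
                "type": coltype or "TEXT",
                "required": bool(notnull),
                "primary_key": bool(pk),
            })
        return fields
    finally:
        con.close()
```

For example, a table created with `CREATE TABLE person (id INTEGER PRIMARY KEY, name TEXT NOT NULL, age INTEGER)` introspects to three field specs, with only `name` marked required.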
California approves solar-powered EV charging network and electric school buses
> The press release from the company said, “heavy-duty vehicles produce more particulate matter than all of the state’s power plants combined”.
> […] for instance why only “10 school buses”?
IARC has recognized diesel exhaust as carcinogenic (lung cancer) since 2012.
Are there other electric school bus programs in the US?
(edit)
https://www.trucks.com/2019/03/22/can-electric-school-buses-...
> Most school systems don’t have sufficient capital to finance the high initial costs of electric bus purchases and charging infrastructure development, he said.
> In the U.S., the school bus market is about 33,000 to 35,000 vehicles per year – about six times more than transit buses.
You May Be Better Off Picking Stocks at Random, Study Finds
How is this not the same conclusion as other studies that find that index funds outperform others?
In addition to diversification that reduces risk of overexposure to down sectors or typically over-performing assets, index funds have survivorship bias: underperforming assets are replaced by assets that meet the fund's criteria.
Yes, it's sad that everyone jumps on index funds; they can be a good idea, but they're not the right choice 100% of the time.
Root: CERN's scientific data analysis framework for C++
Root has some unique and powerful features.
It’s used in particle physics today mostly because it allows performant out-of-memory, on-disk data processing.
With frameworks like Python pandas, you always end up having to manually partition your data if it doesn’t fit in memory. And of course, it’s C++, so by default the data analysis code is pretty performant. This makes a difference when you can iterate your analysis in one hour instead of 20.
That being said, when I last worked with it, Root was a scrambled mess with terrible interfaces and way too many fringe features, e.g. around plotting, that are better handled by Python nowadays. It even has a C++ command line!!!
I wrote a blog post back then about how I thought it could be fixed: https://www.konstantinschubert.com/2016/06/18/root8-what-roo...
> With frameworks like Python pandas, you always end up having to manually partition your data if it doesn’t fit in memory.
"Pandas Docs > Pandas Ecosystem > Out of Core" lists a number of solutions for working with datasets that don't fit into RAM: Blaze, Dask, Dask-ML (dask-distributed; Scikit-Learn, XGBoost, TensorFlow), Koalas, Odo, Ray, Vaex https://pandas-docs.github.io/pandas-docs-travis/ecosystem.h...
The dask API is very similar to the pandas API.
Are there any plans for ROOT to gain support for Apache Parquet, and/or Apache Arrow zero-copy reads and SIMD support, and/or https://RAPIDS.ai (Arrow, numba, Dask, pandas, scikit-learn, XGboost, spark, CUDA-X GPU acceleration, HPC)? https://arrow.apache.org/
https://root.cern.ch/root-has-its-jupyter-kernel (2015)
> Yet another milestone of the integration plan of ROOT with the Jupyter technology has been reached: ROOT now offers a Jupyter kernel! You can try it already now.
> ROOT is the 54th entry in this list and this is pretty cool. Now not only the PyROOT, the ROOT Python bindings, are integrated with notebooks but it's also possible to express your data mining in C++ within a notebook, taking advantage of all the powerful features of ROOT - plotting (now also interactive thanks to [Javascript ROOT](https://root.cern.ch/js/)), multivariate analysis, linear algebra, I/O and reflection: all available within a notebook.
Does this work with JupyterLab now? (edit) Here's the JupyterLab extension developer guide: https://jupyterlab.readthedocs.io/en/stable/developer/extens... (edit) here's the gh issue: https://github.com/root-project/jsroot/issues/166
...
ROOT is now installable with conda: `conda install -c conda-forge root metakernel jupyterlab # notebook`
For c/c++ in Jupyter, see xeus-cling https://github.com/QuantStack/xeus-cling
Coincidentally, cling (wrapped by xeus-cling) is also a product from CERN.
Many of the CERN researchers are pretty deep into C++.
It was there that I got my template meta-programming baptism, back in 2002, when gcc was still trying to cope with template heavy code.
And curiously, also where I got my first safety heavy code reviews of C++ best practices.
I don't believe those are coincidences, but rather collaborations and the right people working together :-)
MesaPy: A Memory-Safe Python Implementation based on PyPy (2018)
I gave this a spin a while back but, unfortunately, didn’t get very far since the software and dependencies have grown long in the byte. Since then, I’ve found RustPython [0], which is progressing toward feature parity with CPython but is entirely written in Rust (!). A side benefit is that it compiles to WebAssembly, so you could sandbox it without too much extra overhead.
> Since then, I’ve found RustPython [0], which is progressing toward feature parity with CPython but is entirely written in Rust (!). A side benefit is that it compiles to WebAssembly, so you could sandbox it without too much extra overhead.
It's now possible to run JupyterLab entirely within a browser with jyve (JupyterLab + pyodide) https://github.com/iodide-project/pyodide/issues/431
Pyodide:
> Pyodide brings the Python runtime to the browser via WebAssembly, along with the Python scientific stack including NumPy, Pandas, Matplotlib, parts of SciPy, and NetworkX. The packages directory lists over 35 packages which are currently available.
Is the RustPython WASM build more performant or otherwise preferable to brython or pyodide?
Ask HN: Configuration Management for Personal Computer?
Hello HN,
Every couple of years I find myself facing the same old tired routine: migrating my stuff off some laptop or desktop to a new one, usually combined with an OS upgrade. Is there anything like the kind of luxuries we now consider normal on the server side (IaaS; Terraform; maybe Ansible) that can be used to manage your PC and that would make re-imaging it as easy as it is on the server side?
Ansible is worth the extra few minutes, IMHO.
+ (minimal) Bootstrap System playbook
+ Complete System playbook (that references group_vars and host_vars)
+ Per-machine playbooks stored alongside the ansible inventory, group_vars, and host_vars in a separate repo (for machine-specific kernel modules and e.g. touchpad config)
+ User playbook that calls my bootstrap dotfiles shell script
+ Bootstrap dotfiles shell script, which creates symlinks and optionally installs virtualenv+virtualenvwrapper, gitflow and hubflow, and some things with pipsi. https://github.com/westurner/dotfiles/blob/develop/scripts/b...
+ setup_miniconda.sh that creates a CONDA_ROOT and CONDA_ENVS_PATH for each version of CPython (currently py27-py37)
Over the years, I've worked with Bash, Fabric, Puppet, SaltStack, and now Ansible + Bash
I log shell commands with a script called usrlog.sh that creates a $USER and per-virtualenv tab-delimited logfiles with unique per-terminal-session identifiers and ISO8601 timestamps; so it's really easy to just grep for the apt/yum/dnf commands that I ran ad-hoc when I should've just taken a second to create an Ansible role with `ansible-galaxy init ansible-role-name ` and referenced that in a consolidated system playbook with a `when` clause. https://westurner.github.io/dotfiles/usrlog.html#usrlog
A couple weeks ago I added an old i386 netbook to my master Ansible inventory and system playbook and VScode wouldn't install because VScode Linux is x86-64 only and the machine doesn't have enough RAM; so I created when clauses to exclude VScode and extensions on that box (with host_vars). Gvim with my dotvim works great there too though. Someday I'll merge my dotvim with SpaceVim and give SpaceMacs a try; `git clone; make install` works great, but vim-enhanced/vim-full needs to be installed with the system package manager first so that the vimscript plugin installer works and so that the vim binary gets updated when I update all.
I've tested plenty of Ansible server configs with molecule (in docker containers), but haven't yet taken the time to do a full workstation build with e.g. KVM or VirtualBox or write tests with testinfra. It should be easy enough to just run Ansible as a provisioner in a Vagrantfile or a Packer JSON config. VirtualBox supports multi-monitor VMs and makes USB passthrough easy, but lately Docker is enough for everything but Windows (with a PowerShell script that installs NuGet packages with chocolatey) and MacOS (with a few setup scripts that download and install .dmg's and brew) VMs. Someday I'll write or adapt Ansible roles for Windows and Mac, too.
I still configure browser profiles by hand; but it's pretty easy because I just saved all the links in my tools doc: https://westurner.github.io/tools/#browser-extensions
Someday, I'll do bookmarks sync correctly with e.g. Chromium and Firefox; which'll require extending westurner/pbm to support Firefox SQLite or a rewrite in JS with the WebExtension bookmarks API.
A few times, I've decided to write docs for my dotfiles and configuration management policies like someone else is actually going to use them; it seemed like a good exercise at the time, but invariably I have to figure out what the ultimate command sequence was and put that in a shell script (or a Makefile, which adds a dependency on GNU make that's often worth it)
Clonezilla is great and free, but things get out of date fast in a golden master image. It's actually possible to PXE boot clonezilla with Cobbler, but, AFAICT, there's no good way to secure e.g. per-machine disk or other config with PXE. Apt-cacher-ng can proxy-cache-mirror yum repos, too. Pulp requires a bit of RAM but looks like a solid package caching system. I haven't yet tested how well Squid works as a package cache when all of the machines are simultaneously downloading the exact same packages before a canary system (e.g. in a VM) has populated the package cache.
I'm still learning to do as much as possible with Docker containers and Dockerfiles or REES (Reproducible Execution Environment Specifications) -compatible dependency configs that work with e.g. repo2docker and https://mybinder.org/ (BinderHub)
GitHub Actions now supports CI/CD, free for public repositories
This is great news for developers. The trend has been to combine version control and CI for years now. For a timeline see https://about.gitlab.com/2019/08/08/built-in-ci-cd-version-c...
This is bad news for the CI providers that depend on GitHub, in particular CircleCI. Luckily for them (or maybe they saw this coming), they recently raised a Series D https://circleci.com/blog/we-raised-a-56m-series-d-what-s-ne... and are already looking to add support for more platforms. It is hard to depend on a marketplace when it starts competing with you, from planning (Waffle.io), to dependency scanning (Gemnasium, acquired by us), to CI (the Travis CI layoffs were especially sad).
It is interesting that a lot of the things GitHub is shipping are already part of Azure DevOps https://docs.microsoft.com/en-us/azure/architecture/example-... The overlap between Azure DevOps and GitHub seems to be increasing rather than decreasing. I wonder what the integration story is and what will happen to Azure DevOps.
It's a horrible trend. CI should not be tied to version control. I mean we all have to deal with it now, but I'd much rather have my CI agnostic and not have config files for it checked into the repo.
I've browsed through the article you linked to; one of the subtitles was "Realizing the future of DevOps is a single application". Also a horrible idea: I think it locks developers into a certain workflow that is hard to escape. You have an issue with your setup you can't figure out - happened to me with Gitlab CI - sorry, you're out of luck. Every application is different; DevOps processes are something to be carefully crafted for each particular case, with many considerations: large/small company, platform, development cycle, people's preferred workflows, etc. What I like to do is have small, well-tested parts constitute my DevOps. It's a bad idea to adopt something just because everyone is doing it.
To sum it up, code should be separate from testing, deployment etc. On our team, I make sure developers don't have to think about devops. They know how to deploy and test and they know the workflow and commands. But that's about it.
I'm an operations guy, but I think I have a different perspective. The developers I work with don't have to think about CI/CD, but the configuration still lives in the repo; I'm just a contributor to that repo like they are.
Having CI configuration separate from the code sounds like a nightmare when a code change requires CI configurations to be updated. A new version of code requires a new dependency for instance, there needs to be a way to tie the CI configuration change with a commit that introduced that dependency. That comes automatically when they're in the same repo.
Having CI configuration inside the codebase also sounds like a nightmare when changes to the CI or deployment environment require configuration changes or when multiple CI/deployment environments exist.
For example as a use case: Software has dozens of tagged releases; organization moves from deploying on AWS to deploying in a Kubernetes cluster (requiring at least one change to the deployment configuration). Now, to deploy any of the old tagged releases, every release now has to be updated with the new configuration. This gets messy because there are two different orthogonal sets of versions involved. First, the code being developed has versions and second, the environments for testing, integration, and deployment also change over time and have versions to be controlled.
Even more broadly, consider multiple organizations using the same software package. They will each almost certainly have their own CI infrastructure, so there is no one "CI configuration" that could ever be checked into the repository along with the code without each user having to maintain their own forks/patchsets of the repo with all the pain that entails.
You can create a separate repo with your own CI config that pulls in the code you want to test; and thus ignore the code's CI config file. When something breaks, you'd then need to determine in which repo something changed: in the CI config repo, or the code repo. And then, you have CI events attached to PRs in the CI config repository.
IMHO it makes sense to have CI config version controlled in the same repo as the code. Unless there's a good tool for bisecting across multiple repos and subrepos?
The Fed is getting into the Real-Time payments business
This system will need to interface with other domestic and international settlement and payments networks.
There is thus an opportunity for standards, a need for federation, and a need to make it easy for big players to offer liquidity.
As far as I understand, e.g. Ripple and Stellar solve basically exactly the 24x7x365 RTGS problem that FedNow intends to solve; and, they allow all sorts of assets to be plugged into the network. Could FedNow just use a different UNL (Unique Node List) with participating banks operating trusted validators and/or offering liquidity ("liquidity provisioning")?
Notably, Ripple is specifically positioned to do international interbank real time gross settlement (RTGS) and remittances. Ripple could integrate with FedNow directly. Most efficiently, if it complies with KYC/AML requirements, FedNow could operate an XRP Ledger. Or, each bank could operate XRP Ledgers. https://xrpl.org/become-an-xrp-ledger-gateway.html
Getting thousands of banks to comply with an evolving API / EDI spec is no small task. Blockchain solutions require API compliance, have solutions for governance where there are a number of stakeholders seeking to reach consensus, and lack single points of failure.
Here's to hoping that we've learned something about decentralizing distributed systems for resiliency.
>> In contrast, the XRP Ledger requires 80 percent of validators on the entire network, over a two-week period, to continuously support a change before it is applied. Of the approximately 150 validators today, Ripple runs only 10. Unlike Bitcoin and Ethereum — where one miner could have 51 percent of the hashing power — each Ripple validator only has one vote in support of an exchange or ordering a transaction. https://news.ycombinator.com/item?id=19195050
So, you want to get banks on board with only a single USD stablecoin; but you don't want to deal with exchanges or FOREX or anything, because that's a different thing? And this is not just yet another ACH with lower clearance time?
> Interledger Architecture
https://interledger.org/rfcs/0001-interledger-architecture/
> Interledger provides for secure payments across multiple assets on different ledgers. The architecture consists of a conceptual model for interledger payments, a mechanism for securing payments, and a suite of protocols that implement this design.
> The Interledger Protocol (ILP) is the core of the Interledger protocol suite. Colloquially, the whole Interledger stack is sometimes referred to as "ILP". Technically, however, the Interledger Protocol is only one layer in the stack.
> Interledger is not a blockchain, a token, nor a central service. Interledger is a standard way of bridging financial systems. The Interledger architecture is heavily inspired by the Internet architecture described in RFC 1122, RFC 1123 and RFC 1009.
[...]
> You can envision the Interledger as a graph where the points are individual nodes and the edges are accounts between two parties. Parties with only one account can send or receive through the party on the other side of that account. Parties with two or more accounts are connectors, who can facilitate payments to or from anyone they're connected to.
> Connectors provide a service of forwarding packets and relaying money, and they take on some risk when they do so. In exchange, connectors can charge fees and derive a profit from these services. In the open network of the Interledger, connectors are expected to compete among one another to offer the best balance of speed, reliability, coverage, and cost.
Why should we prefer an immutable, cryptographically-signed blockchain solution over SQL/BigTable/MQ for FedNow?
Blockchain and payments standards: https://news.ycombinator.com/item?id=19813340
... Here's the notice and request for comment PDF: "Docket No. OP – 1670: Federal Reserve Actions to Support Interbank Settlement of Faster Payments" https://www.federalreserve.gov/newsevents/pressreleases/file...
"Federal Reserve announces plan to develop a new round-the-clock real-time payment and settlement service to support faster payments" https://www.federalreserve.gov/newsevents/pressreleases/othe...
A Giant Asteroid of Gold Won’t Make Us Richer
> this example shows that real wealth doesn’t actually come from golden hoards. It comes from the productive activities of human beings creating things that other human beings desire.
Value, Price, and Wealth
???
I'd suggest a different triad: cost, price, value.
https://old.reddit.com/r/dredmorbius/comments/48rd02/cost_va...
Good call. I don't know where I was going with that. Cost, price, value, and wealth.
Are there better examples for illustrating the differences between these kinds of distinct terms?
Less-convertible collectibles like coins and baseball cards (which require energy to exchange) have, over time t: costs of production, marketing, and distribution; a retail sales price; a market price; and a 'value', which is abstract and relative (an opportunity cost in terms of fiat currency, which is somehow distinct from price at time t, possibly due to 'speculative information').
Wealth comes from relationships, margins between costs and prices, long term planning, […]
> Are there better examples for illustrating the differences between these kinds of distinct terms?
For concepts as intrinsically fundamental to economics as these are, the agreement and understanding of what they are, even among economists, is surprisingly poor. It's not even clear whether "wealth" refers to a flow or a stock -- Adam Smith uses the term both ways. And much contemporary mainstream "wealth creation" discussion addresses accounting profit rather than economic wealth. Or broader terms such as ecological wealth (or natural capital). There's some progress, and Steve Keen has been synthesizing much of it recently, but the terms fare poorly.
A key issue is that "price" and "exchange value" are often conflated, creating confusion with use/ownership value.
Addressing your terms, "cost" and "price", and typically "value", indicate some metric of exchange or opportunity cost (or benefit). Whilst "wealth", as typically used, tends to relate to some store or accumulation. In electrical terms (a potentially, so to speak, useful analogue) the difference between voltage and charge, with current representing some other property, possibly material flows of goods or energy.
The whole question of media for exchange (currency and the like), and durable forms of financial wealth (land, art, collectibles), is another interesting one, with the discussions by Ricardo and Jevons both useful and flawed.
And don't even get me started on the near total discounting of accumulated natural capital, say, the 100-300 million year factor of time embodied in fossil fuels. The reasons and rationales for excluding that being fascinating (Ricardo, Tolstoy, Gray, Hotelling, Boulding, Soddy, Georgescu-Roegen, Daly, Keen).
You are correct that all value (and hence wealth) is relative, and hence relational.
TL;DR: Not that I'm aware.
Abusing the PHP Query String Parser to Bypass IDS, IPS, and WAF
Hrm... the solution is to keep your PHP codebase up to date...?
The solution is to throw out your WAF.
Possible solutions:
(1) Change all underscores in WAF rule URL attribute names to the appropriate non-greedy regex. Though I'm not sure about the regex the article suggests: '.' only matches one character, AFAIU.
(2) Add a config parameter to PHP that turns off the magical URL parameter name mangling that no webapp should ever depend on (and have it default to off, because if you rely on this 'feature' you should have to change a setting in php.ini anyway).
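To make option (1) concrete, here's a sketch in Python of the idea (the function names `php_mangle` and `waf_pattern` are hypothetical, and the mangling shown covers only the '.' and space cases):

```python
import re

# PHP converts '.' and ' ' in incoming query-string parameter names to '_',
# so "admin.token" and "admin token" both arrive as $_GET['admin_token'].
# A WAF rule matching only the literal name "admin_token" misses the aliases.

def php_mangle(name: str) -> str:
    """Approximate PHP's query-string parameter-name mangling."""
    return name.replace('.', '_').replace(' ', '_')

def waf_pattern(rule_name: str) -> "re.Pattern":
    """Widen each '_' in a rule's parameter name into a character class
    so the rule also matches names that PHP will mangle into it."""
    return re.compile(re.escape(rule_name).replace('_', '[._ ]'))
```

A character class like `[._ ]` still matches exactly one character, which is consistent with PHP's one-to-one substitution; a bare `.` would also work but over-matches.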
Ask HN: Scripts/commands for extracting URL article text? (links -dump but)
I'd like to have a Unix script that basically generates a text file named, with the page title, with the article text neatly formatted.
This seems to me to be something that would be so commonly desired by people that it would've been done and done and done a hundred times over by now, but I haven't found the magic search terms to dig up people's creations.
I imagine it starts with "links -dump", but then there's using the title as the filename, and removing the padded left margin, wrapping the text, and removing all the excess linkage.
I'm a beginner-amateur when it comes to shell scripting, python, etc. - I can Google well and usually understand script or program logic but don't have terms memorized.
Is this exotic enough that people haven't done it, or as I suspect does this already exist and I'm just not finding it? Much obliged for any help.
Just for the record in case anyone digs this up in a later Google search: install the newspaper and unidecode Python libraries (pip3 install newspaper3k unidecode); re is in the standard library. Then:
from os.path import expanduser
from sys import argv
import re

from newspaper import Article
from unidecode import unidecode

script, arturl = argv

article = Article(arturl)
article.download()
article.parse()

# Transliterate the title to ASCII and slugify it for use as a filename
title2 = unidecode(article.title)
fname2 = title2.lower()
fname2 = re.sub(r"[^\w\s]", '', fname2)   # strip punctuation
fname2 = re.sub(r"\s+", '-', fname2)      # whitespace -> dashes

text2 = unidecode(article.text)
text2 = re.sub(r'\n\s*\n', '\n\n', text2)  # collapse runs of blank lines

# open() does not expand '~', so expanduser() is needed for the path
with open(expanduser('~/Desktop/' + fname2 + '.txt'), 'w') as f:
    f.write(title2 + '\n\n')
    f.write(text2 + '\n')
I execute it from a shell wrapper (quoting "$1" so URLs with special characters survive):

#!/bin/bash
/usr/local/opt/python3/Frameworks/Python.framework/Versions/3.7/bin/python3 ~/bin/url2txt.py "$1"
If I want to run it on all the URLs in a text file:

#!/bin/bash
while IFS='' read -r l || [ -n "$l" ]; do
  ~/bin/u2t "$l"
done < "$1"
I'm sure most of the coders here are wincing at one or multiple mistakes or badly formatted items I've done here, but I'm open to feedback.

There could be collisions where `fname2` is the same for different pages, resulting in unintentional overwriting. A couple of possible solutions: generate a random string and append it to the filename; set fname2 to a hash of the URL; replace unsafe filename characters like '/', '\', and '\n' with e.g. underscores. IIRC, URLs can be longer than the max filename length of many filesystems, so hashes as filenames are the safest solution. You can generate an index of the fetched URLs and store it with JSON or e.g. SQLite (with Records and/or SQLAlchemy, for example).
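A minimal sketch of the URL-hash suggestion (the names `safe_filename` and `update_index` are made up for illustration): a short digest of the URL is appended to the slugified title, so identical titles from different pages can't collide, and a JSON index maps filenames back to their source URLs.

```python
import hashlib
import json
from pathlib import Path

def safe_filename(url: str, title: str, maxlen: int = 80) -> str:
    """Slugified title plus a short URL digest to avoid collisions."""
    slug = ''.join(c if c.isalnum() else '-' for c in title.lower()).strip('-')
    digest = hashlib.sha256(url.encode('utf-8')).hexdigest()[:12]
    return f"{slug[:maxlen]}-{digest}.txt"

def update_index(index_path: Path, url: str, fname: str) -> None:
    """Record filename -> URL in a flat JSON index alongside the files."""
    index = json.loads(index_path.read_text()) if index_path.exists() else {}
    index[fname] = url
    index_path.write_text(json.dumps(index, indent=2))
```

Truncating the slug keeps the name under typical filesystem limits while the digest preserves uniqueness per URL.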
If or when you want to parallelize (to do multiple requests at once, because most of the time is spent waiting for responses from the network), write contention for the index may be an issue that SQLite handles better than a flatfile locking mechanism like creating and deleting an index.json.lock. requests3 and aiohttp-requests support asyncio. requests3 supports HTTP/2 and connection pooling.
SQLite can probably handle storing the text of as many pages as you throw at it with the added benefit of full-text search. Datasette is a really cool interface for sqlite databases of all sorts. https://datasette.readthedocs.io/en/stable/ecosystem.html#to...
...
Apache Nutch + ElasticSearch / Lucene / Solr are production-proven crawling and search applications: https://en.m.wikipedia.org/wiki/Apache_Nutch
> I imagine it starts with "links -dump", but then there's using the title as the filename,
The title tag may exceed the filename length limit, be the same for nested pages, or contain newlines that must be escaped.
These might be helpful for your use case:
"Newspaper3k: Article scraping & curation" https://github.com/codelucas/newspaper
lazyNLP "Library to scrape and clean web pages to create massive datasets" https://github.com/chiphuyen/lazynlp/blob/master/README.md#s...
scrapinghub/extruct https://github.com/scrapinghub/extruct
> extruct is a library for extracting embedded metadata from HTML markup.
> It also has a built-in HTTP server to test its output as JSON.
> Currently, extruct supports:
> - W3C's HTML Microdata
> - embedded JSON-LD
> - Microformat via mf2py
> - Facebook's Open Graph
> - (experimental) RDFa via rdflib
NPR's Guide to Hypothesis-Driven Design for Editorial Projects
HDD – Hypothesis-Driven Development – Research, Plan, Prototype, Develop, Launch, Review.
The article lists (and links to!) "Lean UX" [1] and Google Ventures' Design Sprint Methodology as inspirations.
[1] "Lean UX: Applying Lean Principles to Improve User Experience" http://shop.oreilly.com/product/0636920021827.do
[2] https://www.gv.com/sprint/
"How To Write A Technical Paper" [3][4] has: (Related Work, System Model, Problem Statement), (Your Solution), (Analysis), (Simulation, Experimentation), (Conclusion)
Gryphon: An open-source framework for algorithmic trading in cryptocurrency
Hey folks, I'm the primary maintainer of Gryphon. The backstory here is: I was one of the founders of Tinker, the trading company that built Gryphon as our in-house trading infrastructure. We operated 2014-18, starting with just a simple arbitrage bot, and slowly grew the operation until our trades were peaking above 20% of daily trading volume on the big exchanges.
The company has since wound down (founders moved on to other projects), but I always thought the code deserved to be shared, so we've open sourced it. Someone trying to do what we did will probably save 1.5 years of engineering effort if they build on Gryphon vs. make their own. As far as I know there isn't anything out there like this, in any market (not just cryptocurrencies).
Hope you guys like it!
> As far as I know there isn't anything out there like this, in any market (not just cryptocurrencies).
How does Gryphon compare to Catalyst (Zipline)? https://github.com/enigmampc/catalyst
They list a few example algorithms: https://enigma.co/catalyst/example-algos.html
"Ask HN: Why would anyone share trading algorithms and compare by performance?" https://news.ycombinator.com/item?id=15802834 (pyfolio, popular [Zipline] algos shared through Quantopian)
"Superalgos and the Trading Singularity" https://news.ycombinator.com/item?id=19109333 (awesome-quant,)
Looks great!
Though looking at the code -> https://github.com/garethdmm/gryphon/tree/master/gryphon/lib...
Seems quite hard to extend it with new exchanges.
Shouldn't be, you just need to write a wrapper for the exchange API with the interface defined in 'gryphon.lib.exchange.exchange_api_wrapper'. I'll add an article to the docs with more about that soon.
Would CCXT be useful here? https://github.com/ccxt/ccxt
> The ccxt library currently supports the following 135 cryptocurrency exchange markets and trading APIs:
It's possible CCXT could be used to easily wrap other exchanges into gryphon. I'm not familiar with the library so hard to guess if it would be a net win or not.
Thanks for taking the time to open source this.
So is its use case limited to arb? Or are other HFT strategies supported?
It's perfectly general: arb, market making, signal trading, ml, etc. Whatever strategy class you're thinking of, you can probably implement it on Gryphon.
Can you please explain to me how a tool written in python can be used for HFT or market making?
I'm asking because we generally used ASICs and C++ in the past, or more recently Rust. Even GPUs are often difficult because they introduce milliseconds of latency.
If you want to restrict the definition of HFT to only sub-millisecond strategies you're correct. But then, all HFT is impossible in crypto, since with web request latency and rate limits, it would be very difficult to get tick speeds even in the 10s of milliseconds. It's fine if you want to call this "algo trading" instead of HFT, but I think a common understanding of the term would include Gryphon's capabilities.
In any case Gryphon uses Cython to compile itself down to C, which isn't quite as good as writing in native C but is a good chunk of the way there.
> In any case Gryphon uses Cython to compile itself down to C, which isn't quite as good as writing in native C but is a good chunk of the way there.
Would there be any advantage to asyncio with uvloop (also written in Cython (on libuv like Node) like Pandas)? https://github.com/MagicStack/uvloop
IDK how many e.g. signals routines benefit from asyncio yet.
Wow. That’s pretty cool.
Been toying with the idea of algo trading on stock market. Nice to have a reference work
I've been playing around with using lstm recurrent nets to find patterns in forex trades, with no real expectation of anything other than learning about recurrent nets (and td convolutional nets). I was able to access 15 years of historical tick data. I would imagine lack of historical pricing data would be an issue for any machine learning approach to crypto trading. Even with 15 years of daily prices I only have ~5500 samples per major currency pair. I've toyed with learning off hourly prices rather than daily, and I've also thought about creating more samples by shifting prices up or down, since perhaps the general patterns would be the same.
Ouch. Out of all the possible subjects to learn NNs on, you have picked by far the most difficult possible. Seriously. If you think of an analogue to rocketry, with the easiest being launching fireworks from a bottle and the hardest being a mission to Mars, you have picked a Moon landing.
I don't even know where to begin. Financial data has an extremely low signal-to-noise ratio and is fraught with pitfalls. It is highly non-normal, heteroscedastic, non-stationary, and frequently changes behavioural regimes. It is irregular, the information content is itself irregular, and the prices sold by vendors often have difficult-to-detect issues that will taint your results until you actually start trading and realise that a fundamental assumption was wrong. You may train a model on one period and find that the market behaviour has changed and your model is rubbish. Cross-validation and backtesting on black-box algorithms with heavy parameter tuning is a field of study on its own, with so many issues that endless papers have been written on each specific nuance.
Successfully building ML models for trading is an extremely difficult discipline that requires a deep understanding of the markets, the idiosyncrasies of market data, statistics, and programming. Most quant shops who run successful ML algos (they are quite rare) have dedicated data teams whose entire remit is to source and clean data. The saying of rubbish in, rubbish out is very true. Even data providers like Reuters or Bloomberg frequently have crap data. We pay nearly 500k a year to Reuters, and find errors in their tick data every week. Data like spot forex is a special beast because the market is decentralized. There is no exchange which could provide an authoritative price feed. Trades have been rolled back in the past, and if your data feed does not reflect this, you are effectively analysing junk data.
I don't even want to get started on the fact that trying to train an RNN on 5500 observations is folly. Did you treat the data in any way? The common way to regularise market data for ML is to resample it into information bars. That is not going to work on a daily basis, so you should start off with actual tick data.
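As a sketch of what information-driven bars look like (volume bars here; the function name and parameters are assumptions, and real implementations also track timestamps, opens, highs, and lows):

```python
def volume_bars(prices, volumes, bar_volume):
    """Group ticks into bars that each contain roughly bar_volume traded
    units; returns each bar's closing price. Unlike fixed time bars, bar
    boundaries adapt to activity, so busy periods produce more bars and
    quiet periods fewer, which regularises the information per sample."""
    closes, cum = [], 0.0
    for price, vol in zip(prices, volumes):
        cum += vol
        if cum >= bar_volume:
            closes.append(price)  # this tick closes the current bar
            cum = 0.0
    return closes  # any leftover partial bar is discarded
```

The same skeleton works for tick bars (count ticks instead of volume) or dollar bars (accumulate price times volume).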
Nearly every starry-eyed junior quant goes in with the notion that you can just run some fancy ML models on some market data and you'll get a star trading algo; that a small handful of statistical tests will tell you whether your results are meaningful, whether your data has autocorrelation or mean-reverting properties. In reality, ML models are very difficult to train on financial data. Most statistical forecasting tools fail to find relationships, and blindly training models on past data very rarely results in more than spurious performance in a backtest.
I don’t want to discourage you by any means, but I’d start off with something easier than what you are proposing. Finance firms have entire teams dedicated to what you are trying to do and even they often fail to find anything.
Thanks for the great feedback. I have no expectations for this other than the learning, and it's already been successful on that front. Just seemed like a fun thing to poke at when most other hobbyists seem to be doing image analysis and language modeling. I've crawled a couple of forums and I get that there are a lot of people out there who think they can readily use these techniques to make money. I doubt very much that this will be the outcome in my case :).
Where I am now I am just trying to figure out how to treat the data, whether to normalize or stationarize and how to encode inputs, etc. The reason that I am working with daily prices is that the fantasy output of this would be a model that can inform a one day grid trading strategy. It may very well be that daily prices won't work for this.
Whether there's anything like an equilibrium in cryptoasset markets, where there are no underlying fundamentals, is debatable. While there's no book price, PoW coin prices might be rationally describable in terms of (average estimated cost of energy + cost per GH/s + 'speculative value').
A proxy for energy costs, chip costs, and speculative information
Are there standard symbols for this?
Can cryptoasset market returns be predicted with quantum harmonic oscillators as well? What NN topology can learn a quantum harmonic model? https://news.ycombinator.com/item?id=19214650
"The Carbon Footprint of Bitcoin" (2019) defines a number of symbols that could become standard in [crypto]economics texts. Figure 2 shows the "profitable efficiency" (which says nothing of investor confidence and speculative information, and how we may overvalue the security, as in 2007-2009). Figure 5 lists upper and lower estimates for the BTC network's electricity use. https://www.cell.com/joule/fulltext/S2542-4351(19)30255-7
Here's a cautionary dialogue about correlative and causal models that may also be relevant to a cryptoasset price NN learning experiment: https://news.ycombinator.com/item?id=20163734
Wind-Powered Car Travels Downwind Faster Than the Wind
NOAA upgrades the U.S. global weather forecast model
> Working with other scientists, Lin developed a model to represent how flowing air carries these substances. The new model divided the atmosphere into cells or boxes and used computer code based on the laws of physics to simulate how air and chemical substances move through each cell and around the globe.
> The model paid close attention to conserving energy, mass and momentum in the atmosphere in each box. This precision resulted in dramatic improvements in the accuracy and realism of the atmospheric chemistry.
Global Forecast System > Future https://en.wikipedia.org/wiki/Global_Forecast_System#Future
A plan to change how Harvard teaches economics
This isn't an economics class. It's a public policy class. Here are the course topics:[1]
- Part I: Equality of Opportunity
- Part II: Education
- Part III: Racial Disparities
- Part IV: Health
- Part V: Criminal Justice
- Part VI: Climate Change
- Part VII: Tax Policy
- Part VIII: Economic Development and Institutional Change
This really belongs in Harvard's "JFK School of Government", not economics.
Possible topics for a modern economics intro class:
- Instability and equilibrium, or why markets oscillate.
- From zero to one, the tendency to and effects of monopoly and near-monopoly.
- Externalities, their uses and discontents.
- Debt vs. equity vs. what tax policy rewards
- Scarce resources that don't map to money - attention and time.
- Finance as a system decoupled from productive activity
[1] https://opportunityinsights.org/wp-content/uploads/2019/05/E...
Is there a single book that covers all that? That would be awesome.
Schools like Harvard don’t teach from a book.
Wrong side of the Atlantic. I didn’t have to buy a single textbook my entire undergrad because we had libraries and lecture notes. Required textbooks are far more a US thing than most other countries.
Harvard absolutely has courses taught from a book. I’m sure Mankiw requires his textbook for Harvard’s intro econ course. He wrote it, he thinks it’s good.
And he's made $42 million in royalties on the book. It's almost endearing how economists claim they are somehow immune to incentives, until you pause to consider they are primarily employed as apologists for the continuation of rent-seeking policies that entrench the rich and mighty.
> apologists for the continuation of rent-seeking policies that entrench the rich and mighty.
This.
"THE IMF CONFIRMS THAT 'TRICKLE-DOWN' ECONOMICS IS, INDEED, A JOKE" https://psmag.com/economics/trickle-down-economics-is-indeed...
> INCREASING THE INCOME SHARE TO THE BOTTOM 20 PERCENT OF CITIZENS BY A MERE ONE PERCENT RESULTS IN A 0.38 PERCENTAGE POINT JUMP IN GDP GROWTH.
> The IMF report, authored by five economists, presents a scathing rejection of the trickle-down approach, arguing that the monetary philosophy has been used as a justification for growing income inequality over the past several decades. "Income distribution matters for growth," they write. "Specifically, if the income share of the top 20 percent increases, then GDP growth actually declined over the medium term, suggesting that the benefits do not trickle down."
"Causes and Consequences of Income Inequality: A Global Perspective" (2015) https://scholar.google.com/scholar?hl=en&as_sdt=0%2C43&q=%22...
I'll add that we tend to overlook the level of government spending during periods of trickle-down economics and conflate the two. Change in government spending (somewhat unfortunately, regardless of revenues) is a relevant factor.
Let's make this economy great again? How about you identify the decade(s) you're referring to and I'll show you the tax revenue (on income and now capital gains), federal debt per capita, and the growth in GDP.
The difference between big data/data science/empirical study and 'classical' economics (by which I'd include any system of economics that seeks to explain human behavior via an underlying metatheory) is that a primarily empirical approach obscures the necessary underlying theory present in any experiment where you're trying to fit data to a curve.
For example, when you run a science experiment and you plot the data, you may find that you're looking at a line. While this is an interesting finding, it has zero predictive value for anything other than the exact situation you've collected data for. In order to formulate a scientific law, you first must (a) believe that such a thing exists and (b) have some theory as to what shape the curve ought to fit. For example, a naive look at physics using an 'empirical' approach might incorrectly conclude that force is mass times acceleration. While moderately useful for many problems, this offers little predictive power in the general case. In order to actually formulate a law that can be of predictive value, you have to first consider various other laws and axioms (such as the constant speed of light), at which point -- by deduction, without any need of empiricism -- you determine that this is wrong, and you need another kind of equation to fit your data to.
I don't know if the simplistic demand curves drawn in the original text book are correct or not. However, at least those are based on a particular set of assumptions that can be validated or not. The kind of empiricism put forth by Mr Chetty does not offer this at all.
All this is to say that, while data is useful for validation, it is not useful for prediction. The last thing we need is a black-box machine learning model to make major economic decisions off of. What we do need is proper models that are then validated, which don't necessarily need 'big data.'
> All this is to say that, while data is useful for validation, it is not useful for prediction. The last thing we need is a black-box machine learning model to make major economic decisions off of. What we do need is proper models that are then validated, which don't necessarily need 'big data.'
Hand-wavy theory - predicated upon physical-world models of equilibrium which are themselves classical and incomplete - without validation is preferable to empirical models? Please.
Estimating the predictive power of some LaTeX equations is a different task than measuring error of a trained model.
If the model does not fit all of the big data, the error term is higher; regardless of whether the model was pulled out of a hat in front of a captive audience or deduced through inference from actual data fed through an unbiased analysis pipeline.
If the 'black-box predictive model' has lower error for all available data, the task is then to reverse the model! Not to argue for unvalidated theory.
Here are a few discussions regarding validating economic models, some excellent open econometric lectures (as notebooks that are unfortunately not in an easily-testable programmatic form), the lack of responsible validation, and some tools and datasets that may be useful for validating hand-wavy classical economic theories:
"When does the concept of equilibrium work in economics?" https://news.ycombinator.com/item?id=19214650
> "Lectures in Quantitative Economics as Python and Julia Notebooks" https://news.ycombinator.com/item?id=19083479 (data sources (pandas-datareader, pandaSDMX), tools, latex2sympy)
That's just an equation in a PDF.
(edit) Here's another useful thread: "Ask HN: Data analysis workflow?" https://news.ycombinator.com/item?id=18798244
Most of the interesting economic questions are inference problems, not prediction problems. The question is not "what is the best guess of y[i] given these values of x[i]'s", but what would y[i] have been for this very individual i (or country, in macroeconomics) if we could have wound back the clock and changed the values of the x[i]'s for this individual. The methods that economists know and use may not be the best, but the standard ML prediction methods do not address the same questions, and data scientists without a social/economic/medical background are often not even aware of the distinction.
Economists and social scientists try to do non-experimental causal inference. Maybe they're not good at it, maybe the very problem is unsolvable, but it's not because they don't know how Random Forests or RNNs work. Economists already know that students from single-parent families do worse at school than those from married families. If the problem is just to predict individual student results, the number of parents in the household is certainly a good predictor. The problem facing economists is: would encouraging marriage or discouraging divorce improve student results? Nothing in PyTorch or TensorFlow will help with the answer.
Backtesting algorithmic trading algorithms is fairly simple: what actions would the model have taken given the available data at that time, and how would those trading decisions have affected the single objective dependent variable. Backtesting, paper trading, live trading.
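That walk-forward loop can be sketched in a few lines (a toy illustration, not a production backtester; it ignores fees, slippage, and position sizing):

```python
def backtest(prices, signal):
    """Walk-forward simulation: at each step the strategy sees only the
    history up to time t (no lookahead), returns a target position
    (+1 long, 0 flat, -1 short), and that position earns the next
    period's return. Returns the equity curve."""
    cash, equity = 1.0, []
    for t in range(len(prices) - 1):
        position = signal(prices[: t + 1])      # information available at t
        ret = prices[t + 1] / prices[t] - 1.0   # next period's return
        cash *= 1.0 + position * ret
        equity.append(cash)
    return equity
```

The single objective dependent variable here is the final equity; paper trading and live trading then check that the same `signal` behaves as the simulation predicted.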
Medicine (and also social sciences) is indeed more complex; but classification and prediction are still the basis for making treatment recommendations, for example.
Still, the task really is the same. A NN (like those that Torch, Theano, TensorFlow, and PyTorch produce; now with the ONNX standard for neural network model interchange) learns complex relations and really doesn't care about causality: minimize the error term. Recent progress in reducing the size of NN models e.g. for offline natural language classification on mobile devices has centered around identifying redundant neuronal connections ("from 100GB to just 0.5GB"). Reversing a NN into a far less complex symbolic model (with variable names) is not a new objective. NNs are being applied for feature selection, XGBoost wins many Kaggle competitions, and combinations thereof appear to be promising.
Actually testing second-order effects of evidence-based economic policy recommendations is certainly a complex highly-multivariate task (with unfortunate ideological digression that presumes a higher-order understanding based upon seeming truisms that are not at all validated given, in many instances, any data). A causal model may not be necessary or even reasonably explainable; and what objective dependent variables should we optimize for? Short term growth or long-term prosperity with environmental sustainability?
... "Please highly weight voluntary sustainability reporting metrics along with fundamentals" when making investments and policy decisions?
Were/are the World3 models causal? Many of their predictions have subsequently been validated. Are those policy recommendations (e.g. in "The Limits to Growth") even more applicable today, or do we need to add more labeled data and "Restart and Run All"?
...
From https://research.stlouisfed.org/useraccount/fredcast/faq/ :
> FREDcast™ is an interactive forecasting game in which players make forecasts for four economic releases: GDP, inflation, employment, and unemployment. All forecasts are for the current month—or current quarter in the case of GDP. Forecasts must be submitted by the 20th of the current month. For real GDP growth, players submit a forecast for current-quarter GDP each month during the current quarter. Forecasts for each of the four variables are scored for accuracy, and a total monthly score is obtained from these scores. Scores for each monthly forecast are based on the magnitude of the forecast error. These monthly scores are weighted over time and accumulated to give an overall performance.
> Higher scores reflect greater accuracy over time. Past months' performances are downweighted so that more-recent performance plays a larger part in the scoring.
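The downweighting described above might look something like an exponentially decayed average of per-month errors (this weighting scheme is an assumption for illustration, not FRED's actual formula):

```python
def overall_score(monthly_errors, decay=0.8):
    """Exponentially downweight older months: the most recent error gets
    weight 1, the one before it `decay`, then `decay**2`, and so on.
    Smaller weighted-average error maps to a higher score in (0, 1]."""
    total_w, total = 0.0, 0.0
    for age, err in enumerate(reversed(monthly_errors)):
        w = decay ** age
        total_w += w
        total += w * abs(err)
    return 1.0 / (1.0 + total / total_w)
```

Under this scheme a recent large miss hurts the score more than the same miss several months ago, matching the "more-recent performance plays a larger part" description.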
The #GlobalGoals Targets and Indicators may be our best set of variables to optimize for from 2015 through 2030; I suppose all of them are economic.
Using predictive models for policy is not new, in fact it was the standard approach long before more inferential models, and the famed Lucas critique precisely targets a primitive approach similar to what you are proposing.
The issue is the following: In economics, one is interested in an underlying parameter of a complex equilibrium system (or, if you wish, a non-equilibrium complex system of multi-agentic behavior). This may be, for example, some pricing parameter for a given firm - say - how your sold units react to setting a price.
Economics faces two basic issues:
First, any predictive model (like a NN or simple regression) that takes price as an input will not correctly estimate the sensitivity of revenue to price. It is actually usually the case that the inference is reversed.
A model where price is input, and sold units or revenue is output (or vice-versa) will predict (you can check that using pretty much any dataset of prices and outputs) that higher prices lead to higher outputs, because that is the association in the data. Of course we know that in truth, prices and outputs are co-determined. They are simultaneous phenomena, and regressing one on the other is not sufficient to "causally identify" the correct effect.
This is independent of how sophisticated your model is otherwise. Fitting a better non-linear representation does not help.
The solution is of course to reduce down these "endogenous" phenomena to their basic ingredients. Say you have cost data, and some demand parameters. Then, using a regression model (or NN) to predict the vector of endogenous outcome variables will work, and roughly give you the right inference.
Then, as a firm, you are able to use these (more) exogenous predictive variables to find your correct pricing.
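The simultaneity problem is easy to demonstrate with a toy simulation (all parameters made up): when demand shocks dominate, price and quantity move together in equilibrium, so a naive regression of quantity on price recovers a positive slope even though the structural demand slope is -1.

```python
import random

random.seed(0)
n = 20_000

# Structural model (made-up parameters):
#   demand: q = -1.0 * p + d     (true demand slope: -1)
#   supply: q = +1.0 * p + s
# Solving both gives equilibrium: p = (d - s) / 2,  q = (d + s) / 2
d = [random.gauss(0.0, 2.0) for _ in range(n)]   # large demand shocks
s = [random.gauss(0.0, 0.5) for _ in range(n)]   # small supply shocks
p = [(di - si) / 2.0 for di, si in zip(d, s)]
q = [(di + si) / 2.0 for di, si in zip(d, s)]

# Naive OLS of quantity on price: slope = cov(p, q) / var(p)
mp, mq = sum(p) / n, sum(q) / n
cov = sum((pi - mp) * (qi - mq) for pi, qi in zip(p, q)) / n
var = sum((pi - mp) ** 2 for pi in p) / n
slope = cov / var
# slope comes out strongly positive (population value ~ +0.88),
# the opposite sign of the true demand sensitivity.
```

No amount of non-linear fitting fixes this; the estimator is answering a different question than the causal one.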
This is not new, pops up everywhere in social science, is the basis of a gigantic literature called econometrics, and really has nothing to do with how you do the prediction.
The only thing that NNs add is better predictions (better fitting) and the ability to deal with more data. As this inferential problem shows, using more (and more fine-grained) data is indeed crucial to predicting what a firm should do.
BUT, it is crucial to understand and reason about the underlying causality FIRST, because otherwise even the most sophisticated statistical approach will simply give you wrong results.
Secondly, the counterfactual data for economic issues is usually very scarce. The approach taken by machine learning is problematic, not only because of potentially wrong inference, but also because two points in time may simply not be based on comparable data-generating processes.
In fact, this is exactly the blindness that led to people missing the financial crisis. Of course, with enough data, and long enough samples, one should expect to become pretty good at predicting economic outcomes. But experience has shown that in economics, these data are simply too scarce. The unobserved variation between two quarters, two years, two countries, two firms (etc.) is simply very large and has fat tails. This leads to spontaneous breakdowns of such predictive models.
Taking these two issues together, we see that better non-linear function approximation is not the solution to our problems. Instead, it is a methodological improvement that must be used in conjunction with what we have learned about causality.
Indeed the literature moves into a different direction. Good economic science nowadays means to identify effects via natural experiments and other exogenous shifts that can plausibly show causality.
Of course such experiments are more rare, and more difficult, the larger the scale becomes. Which is why Macroeconomics is arguably the "worst science" in economics, while things like auctions and microstructure of markets are actually surprisingly good science (nowadays).
Doors are wide open for ML techniques, but really only to the point that they are useful in operationalizing more and better data.
Anyone trying to understand economic phenomena needs to be keenly aware of how inference can be done, which requires an understanding (or an approach to) - that is, a theory - of the underlying mechanisms.
Yes, some combination of variables/features grouped and connected with operators that correlate to an optimum (some of which are parameters we can specify), occurring immediately or after a period of lag during which the other variables of the given complex system are dangerously assumed to remain constant.
> In fact, this is exactly the blindness that led to people missing the financial crisis
ML was not necessary to recognize the yield curve inversion as a strongly predictive signal correlating to subsequent contraction.
An NN can certainly learn to predict according to the presence or magnitude of a yield curve inversion and which combinations of other features.
- [ ] Exercise: Learning this and other predictive signals by cherry-picking data and hand-optimizing features may be an extremely appropriate exercise.
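As a concrete instance of hand-building such a feature (illustrative numbers, not real yield data):

```python
def curve_features(y10, y3m):
    """Hand-crafted term-structure features from 10-year and 3-month
    yields: the raw spread and a binary inversion flag (1 when the
    short rate exceeds the long rate) - the kind of engineered input
    a NN could be trained on alongside other series."""
    spread = [a - b for a, b in zip(y10, y3m)]
    inverted = [1 if x < 0 else 0 for x in spread]
    return spread, inverted
```

The magnitude (spread) and the event flag (inverted) carry different information; feeding both lets a model learn from either the level or the regime change.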
"This field is different because it's nonlinear, very complex, there are unquantified and/or uncollected human factors, and temporal"
Maybe we're not in agreement about whether AI and ML can do causal inference just as well if not better than humans manipulating symbols with human cognition and physical world intuition. The time is nigh!
In general, while skepticism and caution are appropriate, many fields suffer from a degree of hubris which prevents them from truly embracing stronger AI in their problem domain. (A human person cannot mutate symbol trees and validate with shuffled and split test data all night long)
> Anyone trying to understand economic phenomena needs to be keenly aware of how inference can be done, which requires an understanding (or an approach to) - that is, a theory - of the underlying mechanisms.
I read this as "must be biased by the literature and willing to disregard an unacceptable error term"; but also caution against rationalizing blind findings which can easily be rationalized as logical due to any number of cognitive biases.
Compared to AI, we're not too rigorous about inductive or deductive inference; we simply store generalizations about human behavior and predict according to syntheses of activations in our human NNs.
If you're suggesting that the information theory that underlies AI and ML is insufficient to learn what we humans have learned in a few hundred years of observing and attempting to optimize, I must disagree (regardless of the hardness or softness of the given complex field). Beyond a few combinations/scenarios, our puny little brains are no match for our department's new willing AI scientist.
> ML was not necessary to recognize the yield curve inversion as a strongly predictive signal correlating to subsequent contraction.
> An NN can certainly learn to predict according to the presence or magnitude of a yield curve inversion in combination with other features.
> - [ ] Exercise: Learning this and other predictive signals by cherry-picking data and hand-optimizing features may be an extremely appropriate exercise.
If the financial crisis has not yet occurred, how will the NN learn a relationship that does not exist in the data?
The exercise of cherry-picking data and hand-optimizing is equivalent to applying theory to your statistical problem. It is what is required if you lack data points, whether you use ML or not. Nevertheless, we (as in humans) are bad at it. Speaking of the financial crisis: it was not AIs that picked up on it, it was some guys applying a sophisticated and deep understanding of causal relationships. And that so few people did this shows how bad we humans are at doing this implicitly and automatically just by looking at data!
> Maybe we're not in agreement about whether AI and ML can do causal inference just as well if not better than humans manipulating symbols with human cognition and physical world intuition. The time is nigh! In general, while skepticism and caution are appropriate, many fields suffer from a degree of hubris which prevents them from truly embracing stronger AI in their problem domain. (A human person cannot mutate symbol trees and validate with shuffled and split test data all night long)
ML and AI certainly can do causal inference. But then you have to do causal inference. Again, prediction on historical data is not equivalent to causal analysis, and neither is backtesting or validation. At the end of the day, AI and ML improves on predictions, but the distinction of causal analysis is a qualitative one.
> I read this as "must be biased by the literature and willing to disregard an unacceptable error term"; but also caution against rationalizing blind findings which can easily be rationalized as logical due to any number of cognitive biases.
No. My point is that for causal analysis, you have to leverage assumptions that are beyond your data set. Where these come from is besides the point. You will always employ a theory, implicitly or explicitly.
The major issue is not that we use theories, but rather that we might do it implicitly, hiding the assumptions about the DGP that allow causal inference. This is where humans are bad. Theories are just theories. With precise assumptions giving us causal identification, we are in a good position to argue where we stand.
If we just run algorithms without really understanding what is going on, we are just repeating the mistakes of the last forty years!
> If you're suggesting that the information theory that underlies AI and ML is insufficient to learn what we humans have learned in a few hundred years of observing and attempting to optimize, I must disagree (regardless of the hardness or softness of the given complex field). Beyond a few combinations/scenarios, our puny little brains are no match for our department's new willing AI scientist.
All the information theory I have seen in any of the machine learning textbooks I have picked up is methodologically equivalent to statistics. In particular, the standard textbooks' (Elements of Statistical Learning, Murphy, etc.) treatment of information theory would only allow causal identification under exactly the same conditions that the statistics literature treats.
I fail to see the difference, or what AI in particular adds. The issue of causal inference is a "hot topic" in many fields, including AI, but the underlying philosophical issues are not exactly new. This includes information theory.
You seem to think that ML has somehow solved this problem. From my reading of these books, I certainly disagree. Causal inference is certainly POSSIBLE - just as in statistics, but ML doesn't give it to you for free!
In particular, note the following issue: To show causal identification, you need to make assumptions on your DGP (exogenous variation, timing, graphical causal relations ... whatever). Even if these assumptions are very implicit, they do exist. Just by looking at data, and running a model, you do not get causal inference. It can not be done "within" the system/model.
If you bake these things into your AI, then it, too, makes these assumptions. There really is no difference. For example, I could imagine an AI that can identify likely exogenous variations in the data and use them to predict counterfactuals. That's probably not too far off, if it doesn't exist already. But this is still based on the assumption that these variations are, indeed, exogenous, which can never be proven within the DGP.
In contrast, I find that most "AI scientists" care very much about prediction, and very little about causal inference. I don't mean this subfield doesn't exist. But it is a subfield. In contrast, for many non-AI scientists, causal inference IS the fundamental question, and prediction is only an afterthought. ML in practice involves doing correct experiments (A/B testing), at best. It will sooner or later also adopt all the other causal inference techniques. But my point stands: I have yet to see what ML adds. Enlighten me!
AI, ML and stats will merge, if they haven't already. The distinction will disappear. I believe the issues will not. I employ a lot of AI/ML techniques in my scientific work. Never have they solved the underlying issue of causal inference for me!
> AI, ML and stats will merge, if they haven't already. The distinction will disappear. I believe the issues will not.
All tools are misapplied; including economics professionals and their advice.
Here's a beautiful Venn diagram of "Colliding Web Sciences" which includes economics as a partially independent category: https://www.google.com/search?q=colliding+web+sciences&tbm=i...
A causal model is a predictive model. We must validate the error of a causal model.
Why are theoretic models hand-wavy? "That's just noise; the model is correct." No, such a model is insufficient to predict changes in the dependent variables in the presence of noise, which is always the case. How does validating a causal model differ from validating a predictive model with historical and future data?
Yield-curve inversion as a signal can be learned by human and artificial NNs. Period. There are a few false positives in the historical data: so, describe the variance attributed to "noise" by searching for additional causal and correlative relations in additional datasets.
I searched for "python causal inference" and found a few resources on the first page of search results: https://www.google.com/search?q=python+causal+inference
CausalInference: https://pypi.org/project/CausalInference/
DoWhy: https://github.com/microsoft/dowhy
CausalImpact (Python port of the R package): https://github.com/dafiti/causalimpact
"What is the best Python package for causal inference?" https://www.quora.com/What-is-the-best-Python-package-for-ca...
Search: graphical model "information theory" [causal] https://www.google.com/search?q=graphical+model+%22informati...
Search: opencog causal inference https://www.google.com/search?q=opencog+causal+inference (MOSES, PLN,)
If you were to write a pseudocode algorithm for an econometric researcher's process of causal inference (and also their cognitive processes (as executed in a NN with a topology)), how would that read?
(Edit) Something about the sufficiency of RL (Reinforcement Learning) for controlling cybernetic systems. https://en.wikipedia.org/wiki/Cybernetics
What's the point of dumping a bunch of Google results here? At least half the results are about implementations of pretty traditional statistical / econometric inference techniques. The Rubin causal inference framework requires either randomized controlled trials or, for propensity-score models, an essentially unverifiable separate modelling step.
Google's CausalImpact model, despite having been featured on Google's AI blog, is a statistical/econometric model (essentially the same as https://www.jstor.org/stable/2981553). It leaves it up to the user to find and designate a set of control variables that are assumed to be unaffected by the treatment. This is not done algorithmically, and has very little to do with RNNs, random forests or regression regularization.
> If you were to write a pseudocode algorithm for an econometric researcher's process of causal inference (and also their cognitive processes (as executed in a NN with a topology)), how would that read?
[1] Set up a proper RCT, that is, randomly assign the treatment to different subjects [2] Calculate the outcome differences between the treated and untreated
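Those two steps can be sketched in a few lines of stdlib Python on simulated subjects (the "true" treatment effect of 2.0 and all outcome noise are invented for illustration):

```python
import random

random.seed(0)

# [1] Randomly assign a (simulated) treatment to subjects.
subjects = list(range(200))
random.shuffle(subjects)
treated, control = set(subjects[:100]), set(subjects[100:])

# Fake outcome: baseline noise plus a hard-coded true treatment effect of 2.0.
def outcome(i):
    return random.gauss(10.0, 3.0) + (2.0 if i in treated else 0.0)

results = {i: outcome(i) for i in range(200)}

# [2] Difference in mean outcomes between treated and untreated.
def mean(xs):
    return sum(xs) / len(xs)

effect = mean([results[i] for i in treated]) - mean([results[i] for i in control])
print(f"estimated effect: {effect:.2f} (true effect: 2.00)")
```

Because the assignment in [1] is random, the difference in means in [2] is an unbiased estimate of the effect; everything hard about real-world policy questions is hidden in step [1].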
For A/B testing your website, the work division between [1] and [2] might be 50-50, or at least of a similar order of magnitude.
For the questions that academic economists wrestle with (say, estimating the effect of increasing school funding or decreasing class size, of shifts between tax deductions vs tax credits vs changing tax rates or bands, or the different outcomes for GDP growth and unemployment of monetary vs fiscal expansion), [1] would be 99.9999% of the work, or completely impossible.
Faced with the impracticality/impossibility of proper experiments, academic micro-economists have typically resorted to Instrumental Variable regressions. AFAICT finding (or rather, convincing the audience that you have found) a proper instrument is not very amenable to automation or data mining.
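To make the IV idea concrete, here is a toy stdlib-only two-stage least squares sketch on simulated data. Note that the instrument's exogeneity is baked into the simulation by construction, which is exactly the assumption one can never verify from the data alone:

```python
import random

random.seed(1)
n, beta = 5000, 1.5  # beta is the true (simulated) causal effect of x on y

z = [random.gauss(0, 1) for _ in range(n)]  # instrument (exogenous by construction)
u = [random.gauss(0, 1) for _ in range(n)]  # unobserved confounder
x = [z[i] + u[i] + random.gauss(0, 1) for i in range(n)]
y = [beta * x[i] + 2.0 * u[i] + random.gauss(0, 1) for i in range(n)]

def slope(a, b):
    """OLS slope of b regressed on a (after centering)."""
    ma, mb = sum(a) / len(a), sum(b) / len(b)
    cov = sum((ai - ma) * (bi - mb) for ai, bi in zip(a, b))
    var = sum((ai - ma) ** 2 for ai in a)
    return cov / var

naive = slope(x, y)              # biased upward by the confounder u
pi = slope(z, x)                 # first stage: x projected on the instrument
x_hat = [pi * zi for zi in z]
iv = slope(x_hat, y)             # second stage: y on the fitted x

print(f"naive OLS: {naive:.2f}, IV/2SLS: {iv:.2f}, true: {beta}")
```

The naive regression overstates the effect because x and y share the confounder u; the two-stage estimate recovers roughly 1.5, but only because we *assumed* z affects y solely through x.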
In academic macro-economics (and hence at Serious Institutions such as central banks and the IMF), the most popular approaches to building causal models over the last 3 or 4 decades have probably been: 1) making a bunch of unrealistic assumptions about the behaviour of individual agents (microfoundations/DSGE models); 2) making a bunch of uninterpretable and unverifiable technical assumptions about the parameters of a generic dynamic stochastic vector process fitted to macro-aggregates (structural VAR with "identifying restrictions"); 3) manually grouping different events in different countries from different periods in history as "similar enough" to support your pet theory: lowering interest rates can lead to a) high inflation, high unemployment (USA 1970s), b) high inflation, low unemployment (Japan 1970s), c) low inflation, high unemployment (EU 2010s), d) low inflation, low unemployment (USA, Japan past 2010s)
I really don't see how RL would help with any of this. Care to come up with something concrete?
> What's the point of dumping a bunch of Google results here? At least half the results are about implementations of pretty traditional etatistical / econometric inference techniques.
Here are some tools for causal inference (and a process for finding projects to contribute to instead of arguing about insufficiency of AI/ML for our very special problem domain here). At least one AGI implementation doesn't need to do causal inference in order to predict the outcomes of actions in a noisy field.
Weather forecasting models don't / don't need to do causal inference.
> A/B testing
Is multi-armed bandit feasible for the domain? Or, in practice, are there too many concurrent changes in variables to have any sort of controlled experiment? Then aren't you trying to do causal inference with mostly observational data?
> I really don't see how a RL would help with any of this. Care to come up with something concrete?
The practice of developing models and continuing on with them when they seem to fit, and when citations or impact reinforce them, is very much an exercise in RL. This is a control system with a feedback loop. A "cybernetic system". It's not unique. It's not too hard for symbolic or neural AI/ML. Stronger AI can or could do [causal] inference.
I am at loss at what you want to say to me, but let me reiterate:
Any learning model by itself is a statistical model. Statistical models are never automatically causal models, although causal models are statistical models.
Several causal models can be observationally equivalent to a single statistical model, but the substantive (inferential) implications on doing "an intervention" on the DGP differ.
It is therefore not enough to run and validate a model on data. Several causal models WILL validate on the same data, but their implications are drastically different. The data ALONE provides you no way to differentiate (we say, identify) the correct causal model without further restrictions.
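A tiny simulation of that point, with made-up Gaussian data: the two causal models "x causes y" and "y causes x" achieve identical fit on the same sample, yet imply different things under an intervention:

```python
import random

random.seed(2)
n = 10000

# Simulated DGP: x causes y (y = 2x + noise). The data alone won't tell you that.
x = [random.gauss(0, 1) for _ in range(n)]
y = [2 * xi + random.gauss(0, 1) for xi in x]

def fit(a, b):
    """OLS slope and R^2 of b regressed on a."""
    ma, mb = sum(a) / n, sum(b) / n
    cov = sum((ai - ma) * (bi - mb) for ai, bi in zip(a, b))
    va = sum((ai - ma) ** 2 for ai in a)
    vb = sum((bi - mb) ** 2 for bi in b)
    return cov / va, cov * cov / (va * vb)

slope_xy, r2_xy = fit(x, y)  # causal model A: x -> y
slope_yx, r2_yx = fit(y, x)  # causal model B: y -> x

# Both models "validate" equally well on the observed data...
print(f"R^2 x->y: {r2_xy:.3f}, R^2 y->x: {r2_yx:.3f}")

# ...but under the intervention do(x := x + 1), model A predicts E[y] shifts
# by slope_xy (about 2), while model B predicts y does not move at all.
print(f"model A: y shifts by {slope_xy:.2f}; model B: y shifts by 0")
```

The R^2 values are identical by construction (correlation is symmetric), so no amount of validation on this data set distinguishes the two causal stories.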
By extension, it is impossible for any ML mechanism to predict unobserved interventions without being a causal model.
ML and AI models CAN be causal models, which is the case if they are based on further assumptions about the DGP. For example, they may be graphical models, SCM/SEM etc. These restrictions can be derived algorithmically, based on all sorts of data, tuning, coding and whatever. It really doesn't change the distinction between causal and statistical analysis.
The way these models become causal is based on assumptions that constitute a theory in the scientific sense. These theories can then of course also be validated. But this is not based on learning from historical data alone. You always have to impose sufficient restrictions on your model (e.g. the DGP) to make such causal inference.
This is not new, but for your benefit, I basically transferred the above from an AI/ML book on causal analysis.
AI/ML can do causal analysis, because it's statistics. AI/ML are not separate from these issues, do not solve these issues ex ante, are not "better" than other techniques except on the dimensions on which they are better as statistical techniques, AND, most importantly, causal application necessarily implies a theory.
Whether this is implicit or explicit is up to the researcher, but there are dangers associated with implicit causal reasoning.
And as Pearl wrote (and he is not a fan of econometrics by any means!), the issue of causal inference was FIRST raised by econometricians, BASED on combining the structure of economic models with statistical inference. In the 1940s.
I mean I get the appeal to trash talk social sciences, but when it comes to causal inference, you probably picked exactly the wrong one.
You are free to disregard economic theory. But you can not claim to do causal analysis without any theory. Doing so implicitly is dangerous. Furthermore, you are wrong in the sense that economic theory has put causal inference issues at the forefront of econometric research, and is therefore good for science even if you dislike those theories.
And by the way, I can come up with a good number of (drastic, hypothetical) policy interventions that would break your inference about a market crash - an inference you only were able to make once you saw such a market crash at least once.
If this dependence is broken, your non-causal model will no longer work, because the relationship between the yield curve and a market crash is not a physical constant. What you did to make it a causal inference is implicitly assume a theory about how markets work (e.g. as they do right now) and that they will stay this way. Actually, you did a lot more, but that's enough.
Now, you and me, we can both agree that your model with yield curves is good enough. We could even agree that you would have found it before the financial crashes, and are a billionaire. But the commonality we agree upon is a context that defines a theory.
Some alien that has been analyzing financial systems all across the universe may disagree, saying that your statistical model is in fact highly sensitive to Earth's political, societal and natural context.
Such is the difficulty of causal analysis.
> By extension, it is impossible for any ML mechanism to predict unobserved interventions without being a causal model.
In lieu of a causal model, when I ask an economist what they think is going to happen and they aren't aware of any historical data - there is no observational data collected following the given combination of variables we'd call an event or an intervention - is it causal inference that they're doing in their head? (With their NN)
> Now, you and me, we can both agree that your model with yield curves is good enough.
Yield curves alone are insufficient due to the rate of false positives. (See: ROC curves for model evaluation, just like everyone else)
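For illustration, here is a toy threshold sweep (every number invented) tracing the true-positive / false-positive trade-off behind an ROC curve for an inversion-style signal:

```python
# Invented (spread, recession) pairs; lower spread = stronger inversion signal.
signal = [(-0.6, True), (-0.3, True), (-0.1, False), (0.0, False),
          (0.2, True), (0.5, False), (0.9, False), (1.4, False)]

positives = sum(1 for _, r in signal if r)
negatives = len(signal) - positives

# Sweep the flagging threshold and report the (TPR, FPR) point for each.
for thresh in (-0.5, 0.0, 0.5, 1.5):
    tp = sum(1 for s, r in signal if s < thresh and r)
    fp = sum(1 for s, r in signal if s < thresh and not r)
    print(f"flag if spread < {thresh:+.1f}: TPR={tp/positives:.2f} FPR={fp/negatives:.2f}")
```

Loosening the threshold catches more real contractions only at the cost of more false alarms, which is the whole point of evaluating such a signal with an ROC curve rather than a single cutoff.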
> We could even agree that you would have found it before the financial crashes,
The given signal was disregarded as a false positive by the appointed individuals at the time; why?
> Some alien that has been analyzing financial systems all across the universe may disagree,
You're going to run out of clean water and energy, and people will be willing to pay for unhealthy sugar water and energy-inefficient transaction networks with a perception of greater security.
That we need a Martian-scientist approach is, IMHO, a consequence of our learned biases: we've inferred relations that have been reinforced, and these cloud our assessment of new and novel solutions.
> Such is the difficulty of causal analysis.
What a helpful discussion. Thanks for explaining all of this to me.
Now, I need to go write my own definitions for counterfactual and DGP and include graphical models in there somewhere.
A further hint, here is a great book about causal analysis from a ML/AI perspective
https://mitpress.mit.edu/books/elements-causal-inference
I feel like you will benefit from reading this!
It's free!
> In lieu of a causal model, when I ask an economist what they think is going to happen and they aren't aware of any historical data - there is no observational data collected following the given combination of variables we'd call an event or an intervention - is it causal inference that they're doing in their head? (With their NN)
It's up for debate if NN's represent what is going on in our heads. But let's for a moment assume it is so.
Then indeed, an economist leverages a big set of data and assumptions about causal connections to speculate how this intervention would change the DGP (the modules in the causal model) and therefore how the result would change.
An AI could potentially do the same (if that is really what we humans do), but so far, we certainly lack the ability to program such a general AI. The reason is, in part, because we have difficulty creating causal AI models even for specialized problems. In that sense, humans are much more sophisticated right now.
It is important to note that such a hypothetical AI would create a theory, based on all sorts of data, analogies, prior research and so forth, just like economists do.
It does not really matter if a scientist, or an AI, does the theorizing. The distinction is between causal and non-causal analysis.
The value of formal theory is to lay down assumptions and tautological statements that leave no doubt about what the theory is. If we see that the theory is wrong, because we disagree with the assumptions, this is actually very good and speaks for the theory. Lots of social science is plagued by "general theories" that can never really be shown to be false ex ante. And given that theories can never be empirically "proven", only validated in the statistical sense, this leads to many parallel theories of doubtful value. Take a gander into sociology if you want to see this in action.
Secondly, and this is very important, we learn from models. This is not often recognized. What we learn from writing down models is how mechanics or modules interact. These interactions, highly logical, are USUALLY much less doubtful than the prior assumptions. For example, if price and revenues are equilibrium phenomena, we LEARN from the model that we CAN NOT estimate them with a standard regression model!
This is exactly what led to causal analysis in this case, because earlier we would literally regress price on quantity or production on price etc. and be happy about it. But the results were often in the entirely wrong direction!
Instead, looking at the theory, we understood the mechanical intricacies of the process we supposedly modeled, and saw that we estimated something completely different than what we interpreted. Causal analysis, among other things, tackles this issue by asking "what it is really that we estimate here?".
> Hand-wavy theory - predicated upon physical-world models of equilibrium which are themselves classical and incomplete - without validation is preferable to empirical models? Please.
My friend, you are strawmanning.
I said,
> What we do need is proper models that are then validated, which don't necessarily need 'big data.'
Which agrees with you. I said we need both and, not one or the other.
> If the model does not fit all of the big data, the error term is higher; regardless of whether the model was pulled out of a hat in front of a captive audience or deduced though inference from actual data fed through an unbiased analysis pipeline.
Big data without a model is still only valid for the scenario the data were collected in.
> If the 'black-box predictive model' has lower error for all available data, the task is then to reverse the model! Not to argue for unvalidated theory.
Certainly, but we should simultaneously recognize that any model so conceived is still only valid in the situations the data were collected in, which makes it not necessarily useful for the future. You could turn such an equation into an economic philosophy, but you'd have to do a lot more, non-metric, work.
How can you possibly be arguing that we should not be testing models with all available data?
All models are limited by the data they're trained on; regardless of whether they are derived through rigorous, standardized, unbiased analysis or through laudable divine inspiration.
From https://news.ycombinator.com/item?id=19084622 :
> pandas-datareader can pull data from e.g. FRED, Eurostat, Quandl, World Bank: https://pandas-datareader.readthedocs.io/en/latest/remote_da...
> pandaSDMX can pull SDMX data from e.g. ECB, Eurostat, ILO, IMF, OECD, UNSD, UNESCO, World Bank; with requests-cache for caching data requests: https://pandasdmx.readthedocs.io/en/latest/#supported-data-p...
> All models are limited by the data they're trained on; regardless of whether they are derived through rigorous, standardized, unbiased analysis or through laudable divine inspiration.
Some of the data we have isn't training data. Purely data-driven models tend to be ensnared by Goodhart's law.
For example, suppose we're issuing 30-year term loans and we have some data that shows that people with things like country club memberships and foie gras on their credit card statements have a higher tendency not to miss payments. So we use that information to make our determination.
But people are aware we're doing this and the same data is externally available, so now people start to waste resources on extravagant luxuries in order to qualify for a loan or a low interest rate, and that only makes it more likely that they ultimately default. However, that consequence doesn't become part of the data set until years have passed and the defaults actually occur, and in the meantime we're using flawed reasoning to issue loans. When we finally figure that out after ten years, the new data we use then will have some fresh different criteria for people to game, because the data is always from the past rather than the future.
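A toy simulation of that feedback loop (all probabilities invented): the luxury-spending proxy is informative while it is a side effect of wealth, and degrades once applicants start targeting it to qualify:

```python
import random

random.seed(3)

def population(gamed):
    """Return (has_luxury_spending, repays_loan) pairs for 2000 applicants.
    Before gaming, luxury spending is only a side effect of being wealthy;
    after gaming, non-wealthy applicants also buy luxuries to qualify."""
    people = []
    for _ in range(2000):
        wealthy = random.random() < 0.5
        repays = random.random() < (0.9 if wealthy else 0.5)
        luxury = wealthy or (gamed and random.random() < 0.8)
        people.append((luxury, repays))
    return people

def repay_rate_given_luxury(people):
    with_lux = [repays for lux, repays in people if lux]
    return sum(with_lux) / len(with_lux)

before_rate = repay_rate_given_luxury(population(gamed=False))
after_rate = repay_rate_given_luxury(population(gamed=True))
print(f"repay rate among luxury spenders, before gaming: {before_rate:.2f}")
print(f"repay rate among luxury spenders, after gaming:  {after_rate:.2f}")
```

The underlying repayment behavior never changes; only the meaning of the proxy does, which is Goodhart's law in miniature: once the measure becomes a target, it stops measuring.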
We've already seen the kind of damage this can do. Politicians see data showing that college-educated people are better off, so they subsidize college loans, only to discover that the signal from a degree that made it lead to such gainful employment is diluted as degrees become more common, that subsidizing loans results in price inflation, and that making a degree a prerequisite for jobs that shouldn't require it creates incentives for degree mills that pump out credentials but not real education.
To get out of this we have to consider not only what people have done in the past but how they are likely to respond to a given policy change, for which we have no historical data prior to when the policy is enacted, and so we need to make those predictions based on logic in addition to data or we go astray.
> To get out of this we have to consider not only what people have done in the past but how they are likely to respond to a given policy change, for which we have no historical data prior to when the policy is enacted, and so we need to make those predictions based on logic in addition to data or we go astray.
"Pete, it's a fool who looks for logic in the chambers of the human heart."
Logically, we might have said "prohibition will reduce substance abuse harms" but the actual data indicates that margins increased. Then, we look at the success of Portugal's decriminalization efforts and cannot at all validate our logical models.
Similarly, we might've logically claimed that "deregulation of the financial industry will help everyone" or "lowering taxes will help everyone" and the data does not support.
So, while I share the concerns about Responsible AI and encoding biases (and about the second-order effects of making policy recommendations from non-causal models without critically, logically thinking first), I am very skeptical about our ability to deduce causal relations without e.g. blind, randomized, longitudinal, interventional studies (which are unfortunately basically impossible to do with [economic] policy, because there is no "ceteris paribus")
https://personalmba.com/second-order-effects/
"Causal Inference Book" https://news.ycombinator.com/item?id=17504366
> https://www.hsph.harvard.edu/miguel-hernan/causal-inference-...
> Causal inference (Causal reasoning) https://en.wikipedia.org/wiki/Causal_inference ( https://en.wikipedia.org/wiki/Causal_reasoning )
The virtue of logic isn't that your model is always correct or that it should be adhered to without modification despite contrary evidence, it's that it allows you to have one to begin with. It's a method of choosing which experiments to conduct. If you think prohibition will reduce substance abuse but then you try it and it doesn't, well, you were wrong, so end prohibition.
This is also a strong argument for "laboratories of democracy" and local control -- if everybody agrees what to do then there is no dispute, but if they don't then let each local region have their own choice, and then we get to see what happens. It allows more experiments to be run at once. Then in the worst case the damage of doing the wrong thing is limited to a smaller area than having the same wrong policy be set nationally or internationally, and in the best case different choices are good in different ways and we get more local diversity.
> If you think prohibition will reduce substance abuse but then you try it and it doesn't, well, you were wrong, so end prohibition.
Maybe we're at a local optimum, though. Maybe this is a sign that we should just double down, surge on in there and get the job done by continuing to do the same thing and expecting different results. Maybe it's not the spec but the implementation.
Recommend a play according to all available data, and logic.
> This is also a strong argument for "laboratories of democracy" and local control -- if everybody agrees what to do then there is no dispute, but if they don't then let each local region have their own choice, and then we get to see what happens. It allows more experiments to be run at once. Then in the worst case the damage of doing the wrong thing is limited to a smaller area than having the same wrong policy be set nationally or internationally, and in the best case different choices are good in different ways and we get more local diversity.
"Adjusting for other factors," the analysis began.
- [ ] Exercise / procedure to be coded: Brainstorm and identify [non-independent] features that may create a more predictive model (a model with a lower error term). Search for confounding variables outside of the given data.
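One minimal way to code the first part of that exercise (all features and data below are invented): score each candidate feature by the residual error of a one-variable fit and keep the one that lowers the error term most:

```python
import random

random.seed(4)
n = 500

# Invented dataset: y depends on f1 and f3; f2 is pure noise.
f1 = [random.gauss(0, 1) for _ in range(n)]
f2 = [random.gauss(0, 1) for _ in range(n)]
f3 = [random.gauss(0, 1) for _ in range(n)]
y = [2 * f1[i] + f3[i] + random.gauss(0, 0.5) for i in range(n)]

def sse_after_fit(feat, target):
    """Residual sum of squares left over after a one-variable OLS fit."""
    mf, mt = sum(feat) / n, sum(target) / n
    cov = sum((a - mf) * (b - mt) for a, b in zip(feat, target))
    var = sum((a - mf) ** 2 for a in feat)
    s = cov / var
    return sum((b - mt - s * (a - mf)) ** 2 for a, b in zip(feat, target))

scores = {name: sse_after_fit(f, y) for name, f in [("f1", f1), ("f2", f2), ("f3", f3)]}
best = min(scores, key=scores.get)
print("residual error by candidate:", {k: round(v, 1) for k, v in scores.items()}, "-> pick", best)
```

This greedy screen only ranks features already in the data set; it cannot surface the confounders *outside* the data that the exercise asks you to brainstorm, which is the human part of the job.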
The New York Times course to teach its reporters data skills is now open-source
It's more work to verify all formulas that reference unnamed variables in a spreadsheet than to review the code inputs and outputs in a notebook.
"Teaching Pandas and Jupyter to Northwestern journalism students" [in DC] https://www.californiacivicdata.org/2017/06/07/dc-python-not...
> http://www.firstpythonnotebook.org/
You can also develop d3.js visualizations — just like NYT — with jupyter notebooks and whichever language(s).
"Data-Driven Journalism" ("ddj") https://en.wikipedia.org/wiki/Data-driven_journalism
http://datadrivenjournalism.net/
"The Data Journalism Handbook 1" https://datajournalism.com/read/handbook/one
"The Data Journalism Handbook 2" https://datajournalism.com/read/handbook/two
While there are a number of ScholarlyArticle journals that can publish notebooks, I'm not aware of any newspapers that are prepared to publish notebooks as NewsArticles. It's pretty easy to `jupyter nbconvert --to html` or `--to markdown`, or to just 'Save as'.
Regarding expressing facts as verifiable claims with structured data in HTML and/or blockchains: "Fact Checks" https://news.ycombinator.com/item?id=15529140
Does this course recommend linking to every source dataset and/or including full citations (with DOI) in the article? Does this course recommend getting a free DOI for the published revision of an e.g. GitHub project repository (containing data, and notebooks and/or the article text) with Zenodo?
No Kings: How Do You Make Good Decisions Efficiently in a Flat Organization?
Group decision-making > Formal systems: https://en.wikipedia.org/wiki/Group_decision-making#Formal_s...
> Consensus decision-making, Voting-based methods, Delphi method, Dotmocracy
Consensus decision-making: https://en.wikipedia.org/wiki/Consensus_decision-making
There's a field that some people are calling "Collaboration Engineering". I learned about this in a university course on collaboration.
6 Patterns of Collaboration [GRCOEB] — Generate, Reduce, Clarify, Organize, Evaluate, Build Consensus
7 Layers of Collaboration [GPrAPTeToS] — Goals, Products, Activities, Patterns of Collaboration, Techniques, Tools, Scripts
The group decision making processes described in the article may already be defined with the thinkLets design pattern language.
A person could argue against humming for various unspecified reasons.
I'll just CC this here from my notes, which everyone can read here [1]:
“Collaboration Engineering: Foundations and Opportunities” de Vreede (2009) http://aisel.aisnet.org/jais/vol10/iss3/7/
“A Seven-Layer Model of Collaboration: Separation of Concerns for Designers of Collaboration Systems” Briggs (2009) http://aisel.aisnet.org/icis2009/26/
Six Patterns of Collaboration “Defining Key Concepts for Collaboration Engineering” Briggs (2006) http://aisel.aisnet.org/amcis2006/17/
“ThinkLets: Achieving Predictable, Repeatable Patterns of Group Interaction with Group Support Systems (GSS)” http://www.academia.edu/259943/ThinkLets_Achieving_Predictab...
https://scholar.google.com/scholar?q=thinklets
[1] https://wrdrd.github.io/docs/consulting/team-building#collab...
4 Years of College, $0 in Debt: How Some Countries Make Education Affordable
It at least makes sense to pay for doctors and nurses to go to school, right? If you want to care for others and you do the work to earn satisfactory grades, I think that investing in your education would have positive ROI.
We had plans here in the US to pay for two years of community college for anyone who enrolled ("America's College Promise"). IDK what happened to that? We should have called it #ObamaCollege so that everyone could attack corporate welfare and bad investments with no ROI.
New York has the Excelsior scholarship for CUNY and SUNY. Tennessee pays for college with lottery proceeds. Are there other state-level efforts to fund higher education in the US such that students can finish school debt-free or close to it?
There are MOOCs (online courses), some of which are worth credit hours for the percentage of people who commit to finishing the course. https://www.classcentral.com/
Khan Academy has free SAT, MCAT, NCLEX-RN, GMAT, and LSAT test prep and primary and supplementary learning resources. https://www.khanacademy.org/test-prep
Free education: https://en.wikipedia.org/wiki/Free_education
Ask HN: What jobs can a software engineer take to tackle climate change?
I'm a software engineer with a diverse background in backend and frontend development.
How do I find jobs related to tackling global warming and climate change in Europe for an English speaker?
Open to ideas and thoughts.
> I'm a software engineer with a diverse background in backend and frontend development.
> How do I find jobs related to tackling global warming and climate change in Europe for an English speaker?
While not directly answering the question, here are some ideas for purchasing, donating, creating new positions, and hiring people that care:
Write more efficient code. Write more efficient compilers. Optimize interpretation and compilation so that code written by people with domain knowledge (who aren't that great at programming, but who are trying to solve other important problems) is more efficient.
Push for PPAs (Power Purchase Agreements) that offset energy use. Push for directly sourcing clean energy.
Use services that at least have 100% PPAs for the energy they use: services that run on clean energy sources.
Choose green datacenters.
- [ ] Add the capability for cloud resource schedulers like Kubernetes and Terraform to prefer or require clean energy datacenters.
Choose to work with companies that voluntarily choose to do sustainability reporting.
Work to help develop (and popularize) blockchain solutions that are more energy efficient and that have equal or better security assurances as less efficient chains.
Advocate for clean energy. Donate to NGOs working for our environment and for clean energy.
Invest in clean energy. There are a number of clean energy ETFs, for example. Better energy storage is a good investment.
Push for certified green buildings and datacenters.
- [ ] We should create some sort of a badge and structured data (JSONLD, RDFa, Microdata) for site headers and/or footers that lets consumers know that we're working toward '200% green' so that we can vote with our money.
Do not vote for people who are rolling back regulations that protect our environment. Pay an organization that pays lobbyists to work the system: that's the game.
Help explain why it's both environment-rational and cost-rational to align with national and international environmental sustainability and clean energy objectives.
Argue that we should make external costs internal in order that markets will optimize for what we actually want.
Thermodynamics is part of the physics curriculum for many software engineering and computer science degrees.
There are a number of existing solutions that solve for energy inefficiency due to unreclaimed waste heat.
"Thermodynamics of Computation Wiki" https://news.ycombinator.com/item?id=18146854
"Why Do Computers Use So Much Energy?" https://news.ycombinator.com/item?id=18139654
YC's request for startups: Government 2.0
There's money to be earned in solving for the #GlobalGoals Goals, Targets, and Indicators:
The Global Goals
1. No Poverty
2. Zero Hunger
3. Good Health and Well-Being
4. Quality Education
5. Gender Equality
6. Clean Water and Sanitation
7. Affordable and Clean Energy
8. Decent Work and Economic Growth
9. Industry, Innovation and Infrastructure
10. Reduced Inequalities
11. Sustainable Cities and Communities
12. Responsible Consumption and Production
13. Climate Action
14. Life Below Water
15. Life on Land
16. Peace, Justice and Strong Institutions
17. Partnerships for the Goals
Almost 40% of Americans Would Struggle to Cover a $400 Emergency
I always wonder what proportion of that group is due to insufficient income, and what proportion is due to terrible financial literacy.
> I always wonder what proportion of that group is due to insufficient income
According to the Social Security Administration [1]:
2017 average net compensation: $48,251.57
2017 median net compensation: $31,561.49
The FPL (Federal Poverty Level) income numbers for Medicaid and the Children's Health Insurance Program (CHIP) eligibility [2]:
>> $12,140 for individuals, $16,460 for a family of 2, $20,780 for a family of 3, $25,100 for a family of 4, $29,420 for a family of 5, $33,740 for a family of 6, $38,060 for a family of 7, $42,380 for a family of 8
Wages are not keeping up with corporate profits. That can't all be due to automation.
The minimum wage is only one factor linked to price inflation. We can raise wages and still keep inflation down to an ideal range.
Maybe it's that we don't understand what it's like to live on $12K or $32K a year without healthcare (due to the lack of Medicaid expansion; due to our collective failure to instill charity as a virtue and to see getting people back on their feet as a good investment). How could we learn (or remember!) what it's like to be in this position, without zero-interest bank loans to bail us out?
> and what proportion is due to terrible financial literacy.
The r/personalfinance wiki is one good resource for personal finance. From [3]:
>> Personal Finance (budgets, interest, growth, inflation, retirement)
Personal Finance https://en.wikipedia.org/wiki/Personal_finance
Khan Academy > College, careers, and more > Personal finance https://www.khanacademy.org/college-careers-more/personal-fi...
"CS 007: Personal Finance For Engineers" https://cs007.blog
https://reddit.com/r/personalfinance/wiki
... How can we make personal finance a required middle and high school curriculum component? [4]
"What are some ways that you can save money in order to meet or exceed inflation?"
Dave Ramsey's 7 Baby Steps to financial freedom [5] seem like good advice? Is the debt snowball method ideal for minimizing interest payments?
[1] https://www.ssa.gov/OACT/COLA/central.html
[2] https://www.healthcare.gov/glossary/federal-poverty-level-fp...
[3] "Ask HN: How can you save money while living on poverty level?" https://news.ycombinator.com/item?id=18894582
[4] "Consumer science (a.k.a. home economics) as a college major" https://news.ycombinator.com/item?id=17894632
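On the debt snowball question above: whether the snowball (smallest balance first) or the avalanche (highest rate first) ordering minimizes interest can be checked with a small simulation. All balances, APRs, and the monthly budget below are hypothetical; the sketch assumes monthly compounding, ignores minimum payments, and assumes the budget always exceeds the accrued interest:

```python
# Compare total interest paid under the debt "snowball" (smallest balance
# first) and "avalanche" (highest APR first) payoff orderings.
# Balances, APRs, and the monthly budget are hypothetical placeholders.

def total_interest(debts, budget, priority):
    """Simulate payoff; each month the budget is applied in priority order."""
    debts = [{"bal": bal, "apr": apr} for bal, apr in debts]
    paid = 0.0
    while any(d["bal"] > 0 for d in debts):
        for d in debts:  # accrue one month of interest on every open balance
            interest = d["bal"] * d["apr"] / 12
            d["bal"] += interest
            paid += interest
        remaining = budget
        for d in sorted(debts, key=priority):
            payment = min(d["bal"], remaining)
            d["bal"] -= payment
            remaining -= payment
    return paid

debts = [(5000, 0.22), (10000, 0.06), (2000, 0.15)]  # (balance, APR)
budget = 1000  # total paid per month

snowball = total_interest(debts, budget, priority=lambda d: d["bal"])
avalanche = total_interest(debts, budget, priority=lambda d: -d["apr"])
print(f"snowball:  ${snowball:,.2f} in interest")
print(f"avalanche: ${avalanche:,.2f} in interest")
```

With these numbers the avalanche ordering pays less total interest; the snowball's advantage is motivational (quick wins), not mathematical.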
Congress should grow the Digital Services budget, it more than pays for itself
> The U.S. Digital Service isn’t perfect, but it is clearly working. The team estimates that for every $1 million invested in USDS that the government will avoid spending $5 million and save thousands of labor hours. Over a five-year period, the team’s efforts will save $1.1 billion, redirect almost 2,000 labor years towards higher value work, and generate over 400 percent return on investment. Most importantly, USDS will continue to deliver better government services for the American people, including Veterans who deserve better.
> In the private sector, these kinds of numbers would not lead to a 50 percent cut in budget. Instead, you’d clearly invest further with that kind of return. Considering the ambitious goals set out in the President’s Management Agenda, the Trump Administration should double down on better support for the public, our troops, and our veterans. The best way to do that is clearly through investments like USDS.
Why would you halve the budget of a team that's yielding a more than 400% ROI (in terms of cost savings)?
https://en.wikipedia.org/wiki/United_States_Digital_Service
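For reference, the 400% figure is just the standard ROI ratio applied to the numbers quoted above:

```python
# ROI = (benefit - cost) / cost, using the figures quoted above:
# every $1 million invested in USDS avoids $5 million in spending.
cost = 1_000_000
avoided = 5_000_000
roi = (avoided - cost) / cost
print(f"ROI: {roi:.0%}")  # ROI: 400%
```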
400% ROI to whom?
More data makes hiding corruption harder, so I'm not sure if it really gives 400% to the decision makers.
USDS reports 400% ROI in savings to the taxpayers who fund the government with tax revenue (instead of kicking the can down the road with debt financing) and improvements in customer service quality.
https://www.usaspending.gov (Federal Funding Accountability and Transparency Act of 2006 (Obama, McCain, Carper, Coburn)) has more fine-grained spending data, but not credit-free immutable distributed ledger transaction IDs, quantitative ROI stats, or performance.gov and #globalgoals goal alignment. We'd need a metadata field on spending bills to link to performance.gov and SDG Goals, Targets, and Indicators.
"Transparency and Accountability"
IIRC, here on HN, I've mentioned a number of times -- and quoted in full -- the 13 plays of the USDS Digital Services Playbook; all of them are applicable to, and should probably be required reading for, all government IT and govtech: https://playbook.cio.gov/
There are forms with workflow states that need human review sometimes. USDS helps with getting those processes online in order to reduce costs, increase cost-efficiency, and increase quality of service.
The Trillion-Dollar Annual Interest Payment
> Given the recent actions of Congress, and the years of prior inaction in changing the nation’s fiscal path, the U.S. government’s annual interest payment will eclipse annual defense spending in only six years. By 2025, annual interest costs on the national debt will reach $724 billion, while annual defense spending will reach $706 billion. To put that into perspective, in the 2018 fiscal year, the U.S. government spent $325 billion in interest payments and spent $622 billion in defense (Exhibit 2).
Why would you cut taxes and debt finance our nation's future?
Oak, a Free and Open Certificate Transparency Log
Great use case for blockchain technology
CT logs are already chained
> Great use case for blockchain technology
>> CT logs are already chained
Trillian is a centralized Merkle tree: it doesn't support native replication (AFAIU?), and there is still a password that can delete or recreate the chain. We can watch for any such inappropriate or errant modifications (due to e.g. solar flares) by manually replicating and verifying every entry in the chain, or by trusting that everything before whatever we consider to be a known hash (which could be colliding) is unmodified since the last time we verified those entries.
According to the Trillian README, Trillian depends upon MySQL/MariaDB, and thus internal/private replication is only as good as the SQL replication model (which doesn't have a distributed consensus algorithm like e.g. Paxos).
A Merkle tree alone is not a blockchain. It provides more assurance of data integrity than a regular tree, but verifying that the whole chain of hashes actually is good, and distributed replication without configuring e.g. SSL certs, are primary features of blockchains.
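To make the "chain of hashes" point concrete, here is a minimal sketch (not the actual Trillian or CT data structures): each entry commits to the previous entry's hash, so a verifier must recompute the whole chain, and tampering with any entry breaks every later link:

```python
import hashlib

def entry_hash(prev_hash: str, data: bytes) -> str:
    """Hash an entry together with the previous entry's hash."""
    return hashlib.sha256(prev_hash.encode() + data).hexdigest()

def build_chain(entries):
    chain, prev = [], "0" * 64  # genesis sentinel
    for data in entries:
        h = entry_hash(prev, data)
        chain.append({"prev": prev, "data": data, "hash": h})
        prev = h
    return chain

def verify_chain(chain) -> bool:
    """Recompute every hash; a modified entry invalidates all later links."""
    prev = "0" * 64
    for e in chain:
        if e["prev"] != prev or entry_hash(prev, e["data"]) != e["hash"]:
            return False
        prev = e["hash"]
    return True

chain = build_chain([b"cert-1", b"cert-2", b"cert-3"])
print(verify_chain(chain))   # True
chain[1]["data"] = b"tampered"
print(verify_chain(chain))   # False
```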
There are multiple certificate issuers, multiple logs, and multiple log verifiers. With no single point of failure, that doesn't sound centralized to me?
Which components of the system are we discussing?
PKI is necessarily centralized: certs depend upon CA certs, which can depend upon CA certs. If any CA is compromised (e.g. by key theft or by brute force, which is infeasible given current ASIC resources' preference for legit income), that CA can sign any CRL. A CT log and a CT log verifier can help us discover that a redundant (and so possibly unauthorized) cert has been issued for a given domain listed in an x.509 cert CN/SAN.
The CT log itself (Trillian, for Google and now LetsEncrypt too), though, runs on MySQL, which has one root password.
The system of multiple independent, redundant CT logs is built upon databases that depend upon presumably manually configured replication keys.
Does my browser call a remote log verifier API over (hopefully pinned with a better fingerprint than MD5) HTTPS?
There are multiple issuers, so from an availability point of view, if one is down, you could choose another. They submit to at least two logs, so if one log is unavailable you could read the other one. This is a form of decentralization.
Now, from a security point of view, it only takes breaking into one issuer to issue bad certificates. But maybe classifying everything as either centralized or decentralized is too simple?
Centralized and decentralized are overloaded terms. We could argue that every system that depends upon DNS is centralized (and thus has a single point of failure).
We could describe replication models as centralized or decentralized. Master/master SQL replication is still not decentralized (regardless of whether there are multiple A records or multiple static IPs configured in the client).
With PKI, we choose the convenience of trusting a CA bundle over having to manually check every cert fingerprint.
Whether a particular chain is centralized or decentralized is often bandied about. When there are a few mining pools that effectively choose which changes are accepted, that's not decentralized either.
That there are multiple redundant independent CT logs is a good thing.
How do I, as a concerned user, securely download (and securely mirror?) one or all of the CT logs and verify that each record's hash depends upon the previous hash? If the browser relies upon a centralized API for checking hash fingerprints, how is that decentralized?
Looks like there is a bit here about how to get started: https://security.stackexchange.com/questions/167366/how-can-...
Most people aren't going to do it, but I think that's not really the point, any more than every user needs to review Linux kernel patches. But I wonder whether there are enough "eyes" on this, and how we would check?
Death rates from energy production per TWh
Apparently the deaths are justified because energy.
Are the subsidies and taxes (incentives and penalties) rational in light of the relative harms of each form of energy?
"Study: U.S. Fossil Fuel Subsidies Exceed Pentagon Spending" https://www.rollingstone.com/politics/politics-news/fossil-f...
> The IMF found that direct and indirect subsidies for coal, oil and gas in the U.S. reached $649 billion in 2015. Pentagon spending that same year was $599 billion.
> The study defines “subsidy” very broadly, as many economists do. It accounts for the “differences between actual consumer fuel prices and how much consumers would pay if prices fully reflected supply costs plus the taxes needed to reflect environmental costs” and other damage, including premature deaths from air pollution.
IDK whether they've included the costs of responding to requests for help with natural disasters that are made more probable by climate change caused by these "externalities" / "external costs" of fossil fuels.
Energy saves lives so some risk probably is justified.
Why isn't the market choosing the least harmful, least lethal energy sources? Energy is largely substitutable: switching costs for consumers like hospitals are basically zero.
(Everyone is free to invest in clean energy at any time)
Switching costs for the entire society are far from zero.
100% Renewable Energy https://en.wikipedia.org/wiki/100%25_renewable_energy
> The main barriers to the widespread implementation of large-scale renewable energy and low-carbon energy strategies are political rather than technological. According to the 2013 Post Carbon Pathways report, which reviewed many international studies, the key roadblocks are: climate change denial, the fossil fuels lobby, political inaction, unsustainable energy consumption, outdated energy infrastructure, and financial constraints.
We need to make the external costs of energy production internal in order to create incentives to prevent these fossil fuel deaths and other costs.
Use links not keys to represent relationships in APIs
A thing may be identified by a URI (/person/123) for which there are zero or more URL routes (/person/123, /v1/person/123). Each additional route complicates caching; redirects are cheap for the server but slower for clients.
JSONLD does define a standard way to indicate that a value is a link: @id (which can be specified in an @context) https://www.w3.org/TR/json-ld11/
One additional downside of storing URIs instead of bare references is that it's more complicated to validate a URI template than a simple regex like \d+ or [abcdef\=\d+]+
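A minimal sketch of the trade-off (the vocabulary, base URL, and route here are hypothetical): a JSON-LD @context can declare that a plain value should be interpreted as a link, so the stored reference stays small and regex-validatable while clients can still derive a dereferenceable URL:

```python
import re

# Bare-key representation: compact, and trivial to validate with a regex.
record = {"name": "Alice", "manager": "123"}
assert re.fullmatch(r"\d+", record["manager"])

# A JSON-LD @context (hypothetical vocabulary and base URL) declaring
# that "manager" values should be interpreted as IRIs:
context = {
    "@context": {
        "@base": "https://api.example.com/person/",
        "manager": {"@type": "@id"},
    }
}

def expand(ctx, key, value):
    """Resolve a bare key into a URL if the context types it as a link."""
    terms = ctx["@context"]
    if isinstance(terms.get(key), dict) and terms[key].get("@type") == "@id":
        return terms["@base"] + value
    return value

print(expand(context, "manager", record["manager"]))
# https://api.example.com/person/123
```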
And from a more fundamental standpoint: I get that disk space is cheap these days, but you just doubled (if not worse) the storage space required for a key, for a vague reason.
It may never make any difference on a small dataset, where storage was unaware of the difference between integer and text anyway. But it would be hiding in the dark. And maybe in a few years a new recruit will have the outstanding idea of converting text links to bigint to save some space...
No Python in Red Hat Linux 8?
/usr/bin/python can point to either /usr/bin/python3 or (as PEP 394 currently recommends) /usr/bin/python2
$ alternatives --config python
FWIU, there are ubi8/python-27 and ubi8/python-36 docker images. IDK if they set /usr/bin/python out of the box? Changing existing shebangs may not be practical for some applications (which will need to specify 'python4' whenever that occurs over the next 10 supported years of RHEL/CentOS 8).
JMAP: A modern, open email protocol
What are the optimizations in JMAP that make it faster than, say, Solid? Solid is built on a bunch of W3C web, security, and linked data standards; LDP: Linked Data Platform, JSON-LD: JSON Linked Data, WebID-TLS, REST, WebSockets, LDN: Linked Data Notifications. [1][2] Different worlds, I suppose.
There's no reason you couldn't represent RFC5322 data with RDF as JSONLD. There's now a way to do streaming JSON-LD.
LDP does paging and querying.
Solid supports pubsub with WebSockets and LDN. It may or may not (yet?) be as efficient for synchronization as JMAP, but it's definitely designed for all types of objects with linked data web standards; and client APIs can just parse JSON-LD.
[1] https://github.com/solid/information#solid-specifications
[2] https://github.com/solid/solid-spec/issues/123 "WebSockets and HTTP/2" SSE (Server-Side Events)
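A rough sketch of the RFC 5322-as-JSON-LD idea from above (the @vocab IRI is a hypothetical placeholder; a real mapping would reuse an existing email vocabulary): parse a message with the standard library and emit its headers as JSON-LD:

```python
import json
from email import message_from_string

raw = """\
From: alice@example.com
To: bob@example.com
Subject: Hello
Message-ID: <abc123@example.com>

Hi Bob.
"""

msg = message_from_string(raw)

# Hypothetical context mapping RFC 5322 header names onto IRIs:
doc = {
    "@context": {"@vocab": "http://example.com/rfc5322#"},
    "@id": msg["Message-ID"],
    "from": msg["From"],
    "to": msg["To"],
    "subject": msg["Subject"],
    "body": msg.get_payload(),
}
print(json.dumps(doc, indent=2))
```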
JMAP: JSON Meta Application Protocol https://en.wikipedia.org/wiki/JSON_Meta_Application_Protocol
Is there an OpenAPI Specification for JMAP? There are a bunch of tools for Swagger / OpenAPIs: DRY interactive API docs, server implementations, code generators: https://swagger.io/tools/open-source/ https://openapi.tools/
Does JMAP support labels, so that I don't need to download a message and an attachment once per label and mark it as read twice, as with labels over IMAP?
How does this integrate with webauthn; is that a different layer?
(edit) Other email things: openpgpjs; Web Key Directory /.well-known/openpgpkey/*; if there's no webserver on the MX domain, you can use the ACME DNS challenge to get free 3-month certs from LetsEncrypt.
I would say your comment, and the first reply, both demonstrate quite effectively why JMAP is probably a better choice for email.
> It may or may not (yet?) be as efficient for synchronization as JMAP, but it's definitely designed for all types of objects
If we hypothetically allow for equal adoption & mindshare of both, and assume both are non-terrible designs, I'd guess the one designed for "all types of objects" is less likely to ever be as efficient as the one designed with a single use-case in mind.
And narrow focus is not only good for optimising specific use-cases, it's also good for adoption as people immediately understand what your protocol is for and how to use it when it has a single purpose and a single source of truth for reference spec, rather than a series of disparate links and vague all-encompassing use-cases.
Solid has brilliant people behind it, but it's too broad, too ambitious, and very much lacks focus, and that will impair adoption because it isn't the "one solution" for anyone's "one problem".
--
To take another perspective on this, there are other commenters in this thread bemoaning the loss of non-HTTP-based protocols. Funnily enough, HTTP itself is a broadly used, broadly useful protocol that can be used for pretty much anything (and had TBL behind it also). The big difference is that Tim wasn't proposing in 1989 that HTTP be the solution to all our internet problems and needs: it was just for hypertext documents. It's only now, post-adoption, that it is used for so much more than that.
> If we hypothetically allow for equal adoption & mindshare of both, and assume both are non-terrible designs, I'd guess the one designed for "all types of objects" is less likely to ever be as efficient as the one designed with a single use-case in mind.
This is a generalization that is not supported by any data.
Standards enable competing solutions. Competing solutions often result in performance gains and efficiency.
Hopefully, there will be performant implementations and we won't need to reinvent the wheel in order to synchronize and send notifications for email, contacts, and calendars.
> There's no reason you couldn't represent RFC5322 data with RDF as JSONLD. There's now a way to do streaming JSON-LD.
Is there a reason you'd want to? I clicked all your links but I still have no idea what Solid is.
To eliminate the need for domain-specific parser implementations on both server and client, to make it easy to index and search this structured data, and to link things with URIs and URLs, like other web applications that also make lots of copies.
Solid is a platform for decentralized linked data storage and retrieval with access controls, notifications, WebID + OAuth/OpenID. The Wikipedia link and spec documents have a more complete description that could be retrieved and stored locally.
Grid Optimization Competition
From "California grid data is live – solar developers take note" https://news.ycombinator.com/item?id=18855820 :
>> It looks like California is at least two generations of technology ahead of other states. Let’s hope the rest of us catch up, so that we have a grid that can make an asset out of every building, every battery, and every solar system.
> +1. Are there any other states with similar grid data available for optimization; or any plans to require or voluntarily offer such a useful capability?
How do these competitions and the live actual data from California-only (so far; AFAIU) compare?
Are there standards for this grid data yet? Without standards, how generalizable are the competition solutions to real-world data?
Blockchain's present opportunity: data interchange standardization
What are the current standards efforts for blockchain data interchange?
W3C JSON-LD, ld-signatures + lds-merkleproof2017 (normalize the data before signing it so that the signature is representation-independent (JSONLD, RDFa, RDF, n-triples)), W3C DID Decentralized Identifiers, W3C Verifiable Claims, Blockcerts.org
W3C Credentials Community Group: https://w3c-ccg.github.io/community/work_items.html#draft-sp... (DID, Multihash (IETF), [...])
"Blockchain Credential Resources; a gist" https://gist.github.com/westurner/4345987bb29fca700f52163c33...
Specifically for payments:
https://www.w3.org/TR/?title=payment (the W3C Payment Request API standardizes browser UI payment/checkout workflows)
ILP: Interledger Protocol https://interledger.org/rfcs/0027-interledger-protocol-4/
> W3C JSON-LD
https://www.w3.org/TR/json-ld/ (JSON-LD 1.0)
https://www.w3.org/TR/json-ld11/ (JSON-LD 1.1)
> ld-signatures + lds-merkleproof2017 (normalize the data before signing it so that the signature is representation-independent (JSONLD, RDFa, RDF, n-triples))
https://w3c-dvcg.github.io/ld-signatures/
https://w3c-dvcg.github.io/lds-merkleproof2017/ (2017 Merkle Proof Linked Data Signature Suite)
> W3C DID Decentralized Identifiers
https://w3c-ccg.github.io/did-primer/
>> A Decentralized Identifier (DID) is a new type of identifier that is globally unique, resolveable with high availability, and cryptographically verifiable. DIDs are typically associated with cryptographic material, such as public keys, and service endpoints, for establishing secure communication channels. DIDs are useful for any application that benefits from self-administered, cryptographically verifiable identifiers such as personal identifiers, organizational identifiers, and identifiers for Internet of Things scenarios. For example, current commercial deployments of W3C Verifiable Credentials heavily utilize Decentralized Identifiers to identify people, organizations, and things and to achieve a number of security and privacy-protecting guarantees.
> W3C Verifiable Claims
https://github.com/w3c/verifiable-claims
https://w3c.github.io/vc-data-model/ (Data Model)
https://w3c.github.io/vc-use-cases/ (Use Cases: Education, Healthcare, Professional Credentials, Legal Identity)
Ask HN: Value of “Shares of Stock options” when joining a startup
I got an offer from a US start-up (well, 25+ employees) which has an office in the EU, where I would join them.
The offer's base salary is good (i.e. higher than average for senior positions in that location), but I intend to negotiate it further, as I possibly have other options. patio11's negotiation guide was a great read in that regard.
However, I'm relocating from a non-EU/US country, and I don't have any idea about the financial systems, the stock markets, how to evaluate "15k shares of stock options", or what "Stock Option and Grant Plan" means; so I'm asking you fellow HNers about this part.
Do I just treat them as worthless and focus on base salary (as some internet sources suggest), or is there a formula to evaluate what they would be worth in, say, 2 years?
There are a number of options/equity calculators:
https://tldroptions.io/ ("~65% of companies will never exit", "~15% of companies will have low exits*", "~20% of companies will make you money")
https://comp.data.frontapp.com/ "Compensation and Equity Calculator"
http://optionsworth.com/ "What are my options worth?"
http://foundrs.com/ "Co-Founder Equity Calculator"
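One way to combine the rough outcome buckets quoted from tldroptions above is an expected-value estimate. The probabilities are the ones quoted; the strike and exit share prices are hypothetical placeholders, and dilution, taxes, vesting, and liquidation preferences are ignored:

```python
# Expected value of an option grant using the rough outcome buckets
# quoted above from tldroptions. All concrete prices are hypothetical.
shares = 15_000
strike = 0.50  # hypothetical per-share strike price
scenarios = [
    (0.65, 0.00),  # ~65%: no exit; options are worthless
    (0.15, 0.40),  # ~15%: low exit; below the strike in this sketch
    (0.20, 4.00),  # ~20%: exit above the strike (hypothetical price)
]
ev = sum(p * max(price - strike, 0) * shares for p, price in scenarios)
print(f"Expected value of the grant: ${ev:,.2f}")
```

The point of the exercise is usually that the expected value is far below `shares * exit_price`, which is why many internet sources say to negotiate base salary first.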
CMU Computer Systems: Self-Grading Lab Assignments (2018)
These look fun; in particular the "Attack Lab".
Dockerfiles might be helpful and easy to keep updated. Alpine Linux or just busybox are probably sufficient?
The instructor's test set could extend FROM the assignment image and run a few tests with e.g. testinfra (pytest).
You can also test code written in C with gtest.
I haven't read through all of the materials: are there suggested (automated) fuzzing tools? Does OSS-Fuzz solve this?
Are there references to CWE and/or the SEI CERT C Coding Standard rules? https://wiki.sei.cmu.edu/confluence/plugins/servlet/mobile?c...
"How could we have changed our development process to catch these bugs/vulns before release?"
"If we have 100% [...] test coverage, would that mean we've prevented these vulns?"
What about 200%?
What on earth would 200% test coverage mean?
Coverage for all the code that was written, and all the code that has yet to be typed, of course.
All thanks to quantum computing :)
Show HN: Debugging-Friendly Tracebacks for Python
pytest also has helpful tracebacks; though only for test runs.
With nose-progressive, you can specify --progressive-editor or update the .noserc so that traceback filepaths are prefixed with your preferred editor command.
vim-unstack parses paths from stack traces / tracebacks (for a number of languages including Python) and opens each in a split at that line number. https://github.com/mattboehm/vim-unstack
Here's the Python regex from my hackish pytb2paths.sh script:
'\s+File "(?P<file>.*)", line (?P<lineno>\d+), in (?P<modulestr>.*)$'
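Applied per line with the `re` module, that pattern pulls (file, lineno, module) frames out of a traceback; a minimal sketch:

```python
import re

# The regex from the script above, applied to each line of a traceback:
PAT = re.compile(r'\s+File "(?P<file>.*)", line (?P<lineno>\d+), in (?P<modulestr>.*)$')

tb = '''\
Traceback (most recent call last):
  File "example.py", line 10, in <module>
    main()
  File "example.py", line 7, in main
    1 / 0
ZeroDivisionError: division by zero
'''

frames = [m.groupdict() for m in map(PAT.match, tb.splitlines()) if m]
for f in frames:
    print(f"{f['file']}:{f['lineno']} ({f['modulestr']})")
```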
https://github.com/westurner/dotfiles/blob/develop/scripts/p...
Why isn't 1 a prime number?
Funny true story about this: One time a man came up to me, handed me his phone, and asked me to call his mom. He was about to pass out, and he asked me not to call an ambulance (God bless America); he appeared to have a concussion. By this time there were about 10-15 people around him and he could barely talk. I asked him what his phone code was, and instead of just giving it to me he said "it's the first four prime numbers". Immediately, about five people shout "1, 2, 3, 5". I was no longer holding the phone, because I had handed it to someone else to make sure it was okay. Sure enough, I was in a mathematical proofs class and we had just discussed this topic. So, I say "one is not a prime number". Of course, we get the phone unlocked on the second try with "2, 3, 5, 7" and the guy's mom is on the way. Everyone thought I was a genius, a hero.
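For the record, a quick sketch of why 1 is excluded (a prime must have exactly two distinct positive divisors):

```python
def is_prime(n: int) -> bool:
    """A prime has exactly two distinct positive divisors: 1 and itself."""
    if n < 2:  # excludes 1 (only one divisor), 0, and negatives
        return False
    return all(n % d for d in range(2, int(n ** 0.5) + 1))

first_four = [n for n in range(1, 20) if is_prime(n)][:4]
print(first_four)  # [2, 3, 5, 7]
```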
You can also dial emergency contacts without unlocking the phone. They are accessible from the medical ID page on iOS, I assume Android has similar.
> You can also dial emergency contacts without unlocking the phone. They are accessible from the medical ID page on iOS, I assume Android has similar.
You can set a Lock Screen Message by searching for "Lock Screen Message" in the Android Settings.
You can also create an "ICE (In Case of Emergency)" contact.
How do we know when we’ve fallen in love? (2016)
Where is the feeling felt? Is there an associated color or a shape or a kinesiologic position?
"Love styles" https://en.wikipedia.org/wiki/Love_styles
"Greek words for love" https://en.wikipedia.org/wiki/Greek_words_for_love
Rare and strange ICD-10 codes
These are too funny: https://www.empr.com/home/features/the-strangest-and-most-ob...
Choice selection:
V91.07X - Burn Due to Water Skis on Fire (?!)
W61.42XA. Struck By Turkey, initial encounter. If a duck is involved, there's a code for that too. (W61.62X)
R46.1. Bizarre Personal Appearance.
My favorite: T63.012D: Toxic effect of rattlesnake venom, intentional self-harm, subsequent encounter
So basically you have a case where someone not only intentionally wanted to harm themselves with rattlesnake venom once, but at least twice!
> So basically you have a case where someone not only intentionally wanted to harm themselves with rattlesnake venom once, but at least twice!
No, you misunderstand the terminology. "Subsequent encounter" means with the doctor not with the rattlesnake. AKA followup care during or after recovery.
> No, you misunderstand the terminology. "Subsequent encounter" means with the doctor not with the rattlesnake
You can reference ICD codes with the schema.org/code property of schema.org/MedicalEntity and subclasses. https://schema.org/docs/meddocs.html
"Subsequent encounter" is poorly defined. IMHO, there should be a code for this.
> "Subsequent encounter" is poorly defined.
"Poorly defined" is poorly defined. Explanations of when to use the D make perfect sense to me.
"The 7th character for “subsequent encounter” is to be used for all encounters after the patient has received active treatment of the condition and is receiving routine care for the condition during the healing or recovery phase. Examples of subsequent encounters include cast change or removal, x-ray to check healing status of a fracture, removal of external or internal fixation device, medication adjustments, and other aftercare and follow-up visits following active treatment of the injury or condition. Encounters for rehabilitation, such as physical and occupational therapy, are another example of the use of the “subsequent encounter” 7th character. For aftercare following an injury, the acute injury code should be assigned with the 7th character for subsequent encounter."
This parses the ICD 10 CM with lxml: https://github.com/westurner/pycd10api/blob/master/pycd10api...
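For illustration, here is a minimal stdlib sketch of the same idea. The linked repo uses lxml; the element names `<diag>`/`<name>`/`<desc>` below are illustrative stand-ins for the tabular schema, not a guaranteed match for the real ICD-10-CM files:

```python
# Sketch: parse an ICD-10-CM-style XML fragment into (code, description) pairs.
# NOTE: the <diag>/<name>/<desc> tag names are assumptions for illustration;
# the linked project parses the actual CMS tabular XML with lxml.
import xml.etree.ElementTree as ET

XML = """
<tabular>
  <diag><name>W61.42XA</name><desc>Struck by turkey, initial encounter</desc></diag>
  <diag><name>R46.1</name><desc>Bizarre personal appearance</desc></diag>
</tabular>
"""

def parse_codes(xml_text):
    root = ET.fromstring(xml_text)
    return [(d.findtext("name"), d.findtext("desc")) for d in root.iter("diag")]

codes = parse_codes(XML)
```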
Python Requests III
So what's the difference from its predecessor?
asyncio, HTTP/2, connection pooling, timeouts, Python 3.6+
README > "Feature Support" https://github.com/kennethreitz/requests3/blob/master/README...
https://github.com/aio-libs/aiohttp already supports asyncio. Why do I need this for asyncio?
Anyway, I see that Issues are disabled for the repo. Is that the new way to develop? /s
Tools built with aiohttp: https://libraries.io/pypi/aiohttp/usage
Tools built with requests: https://libraries.io/pypi/requests/usage
Post-surgical deaths in Scotland drop by a third, attributed to a checklist
My former roommate is a pilot. When I first met him, I noticed that he uses checklists for just about everything, even the most basic everyday tasks.
After some time, I decided to apply that same mentality to my own life. Both in private and work situations.
I get it now. Checklists reduce cognitive load tremendously well, even for basic tasks. As an example: I have a checklist for when I need to travel, it contains stuff like what to pack, asking someone to feed my cat, check windows are closed, dishwasher empty, heating turned down, etc. Before the checklist, I would always be worried I forgot something, now I can relax.
Also, checklists are a great way to improve processes. Basically a way to debug your life. For instance: I once forgot to empty the trash bin before a long trip, I added that to my checklist and haven't had a smelly surprise ever since ;)
This is described in the book called "the checklist manifesto". Very good book by the way.
https://www.amazon.com/Checklist-Manifesto-How-Things-Right/...
The medical community uses checklists. They've been using checklists for a while. A number of studies found that they only help for a few months, while they're new.
That said, "the medical community" is not a homogeneous monolith, and you can absolutely find regional variation in what checklists are used for, how detailed they are, how closely they're followed, how people are accountable for keeping to them, etc.
"The Checklist Manifesto" chose to overlook the studies about how transient the benefit of checklists is.
Are these paper based? Any checklist apps out there for our daily use?
GitHub and GitLab support task checklists in Markdown and also project boards which add and remove labels like 'ready' and 'in progress' when cards are moved between board columns; like kanban:
- [ ] not complete
- [x] completed
Other tools support additional per-task workflow states:
- [o] open
- [x (2019-04-17)] completed on date
I worked on a large hospital internal software project where the task was to build a system for reusable checklists editable through the web that prints them out in duplicate or triplicate at nearby printers. People really liked having the tangible paper copy.
"The Checklist Manifesto" by Atul Gawande was published while I worked there. TIL pilots have been using checklists for process control in order to reduce error for many years.
Evernote, RememberTheMilk, Google Tasks, and Google Keep all support checklists. Asana and Gitea and TaskWarrior support task dependencies.
A person could carry around a Hipster PDA with Bullet Journal style tasks lists with checkboxes; printed from a GTD service with an API and a @media print CSS stylesheet: https://en.wikipedia.org/wiki/Hipster_PDA
I'm not aware of very many tools that support authoring reusable checklists with structured data elements and data validation.
...
There are a number of configuration management systems, like Puppet, Chef, Salt, and Ansible, that build a graph of completable and verifiable tasks and then depth-first traverse that graph (either with hash randomization, resulting in sometimes-different traversals, or with source order as an implicit ordering).
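That traversal can be sketched as a post-order DFS over a task dependency graph. This is a from-scratch illustration of the idea, not any particular tool's internals; the task names are made up:

```python
# Sketch: depth-first traversal of a task dependency graph, applying each
# task only after its dependencies. (No cycle detection, for brevity.)
def traverse(tasks, deps, apply):
    """tasks: iterable of task names; deps: {task: [dependencies]}."""
    done = set()

    def visit(task):
        if task in done:
            return
        for dep in deps.get(task, []):
            visit(dep)          # dependencies first (post-order)
        apply(task)             # then the task itself
        done.add(task)

    for task in tasks:          # source order as the implicit ordering
        visit(task)

order = []
traverse(["webserver", "app"],
         {"webserver": ["pkg", "config"], "config": ["pkg"]},
         order.append)
# order is now ["pkg", "config", "webserver", "app"]
```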
Resource scheduling systems like operating systems and conference room schedulers can take ~task priority into account when optimally ordering tasks given available resources; like triage.
Scheduling algorithms: https://news.ycombinator.com/item?id=15267146
TodoMVC catalogs Todo list implementations with very many MV* JS Frameworks: http://todomvc.com
Reusable checklists could be done with a simple text document that you duplicate every time you need it, no?
For sure. Though many tools don't read .txt (or .md/.markdown) files.
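A plain-text checklist is also easy to work with programmatically. A minimal sketch that counts Markdown task-list state, assuming the `- [ ]` / `- [x]` convention shown above:

```python
# Split a Markdown task list into open and completed items.
import re

TASK = re.compile(r"^\s*[-*]\s+\[([ xX])\]\s+(.*)$")

def checklist_status(text):
    open_items, done_items = [], []
    for line in text.splitlines():
        m = TASK.match(line)
        if m:
            (done_items if m.group(1) in "xX" else open_items).append(m.group(2))
    return open_items, done_items

open_items, done_items = checklist_status(
    "- [x] feed the cat\n- [ ] empty the trash bin\n- [ ] close windows\n")
```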
GitHub and GitLab support (multiple) Issue and Pull Request templates:
Default: /.github/ISSUE_TEMPLATE.md || Configure in web interface
/.github/ISSUE_TEMPLATE/Name.md || /.gitlab/issue_templates/Name.md
Default: /.github/PULL_REQUEST_TEMPLATE.md || Configure in web interface
/.github/PULL_REQUEST_TEMPLATE/Name.md || /.gitlab/merge_request_templates/Name.md
There are template templates in awesome-github-templates [1] and checklist template templates in github-issue-templates [2].
I really want slack to adopt these as well...
I'll often be on a call with a customer and create a checklist in macOS Notes on the fly. Then I'll copy-paste that into Slack or GitHub for simple tracking.
Mattermost supports threaded replies and Markdown with checklist checkboxes
You can post GitHub/GitLab project updates to a Slack/Mattermost channel with webhooks (and search for and display GH/GL issues with /slash commands); though issue edits and checkbox state changes aren't (yet?) included in the events that channels receive.
Apply to Y Combinator
The next Startup School course begins in July: https://www.startupschool.org/
> Learn how to start a startup with YC’s free 10-week online course.
> Here is a list of the top 100 Y Combinator companies by valuation (why valuation?), as of October 2018. We also included YC’s top 12 exits.
Here's the list of the 1,900 Y Combinator companies through Winter 2019 (W19) https://www.ycombinator.com/companies/
"Startup Playbook" by Sam Altman (YC Founder) and Illustrated by Gregory Koberger is also a good read: https://playbook.samaltman.com/
Trunk-Based Development vs. Git Flow
One major advantage of the gitflow/hubflow git workflows is that there is a standard way of merging across branches. For example, a 'hotfix' branch is merged into the stable master branch and also develop with one standard command; there's no need to re-explain and train new devs on how the branches were supposed to work here. I even copied the diagram(s) into my notes: https://westurner.github.io/tools/#hubflow
IMHO, `git log` on the stable master branch containing each and every tagged release is preferable to having multiple open release branches.
Requiring tests to pass before a PR gets merged is a good policy that's independent of the trunk or gitflow workflow decision.
Ask HN: Anyone else write the commit message before they start coding?
I feel like I just learned how to use Git: writing the message first thing has made me a lot more productive. I'm wondering if anyone else does this; I know test driven development is a thing, where people write tests before code, and this seems like a logical extension.
Ask HN: Datalog as the only language for web programming, logic and database
Can Datalog be used as the only language for writing server-side web applications, complex domain business logic, and database queries?
Are there any efforts being made in this direction?
To quote myself from a post the other day https://news.ycombinator.com/item?id=19407170 :
> PyDatalog does Datalog (which is ~Prolog, but similar and very capable) logic programming with SQLAlchemy (and database indexes) and apparently NoSQL support. https://sites.google.com/site/pydatalog/
> Datalog: https://en.wikipedia.org/wiki/Datalog
> ... TBH, IDK about logic programming and bad facts. Resilience to incorrect and incredible information is - I suppose - a desirable feature of any learning system that reevaluates its learnings as additional and contradictory information makes its way into the datastores.
I'm not sure that Datalog is really necessary for most CRUD operations; SQLAlchemy and the SQLAlchemy ORM are generally sufficient for standard database querying CRUD.
The cortex is a neural network of neural networks
Is there a program like codeacademy but for learning sysadmin?
if not, anyone wanna build one?
A few sysadmin and devops curriculum resources; though none but Beaker and Molecule are interactive with any sort of testing AFAIU:
"System Administrator" https://en.wikipedia.org/wiki/System_administrator
"Software Configuration Management" (SCM) https://en.wikipedia.org/wiki/Software_configuration_managem...
"DevOps" https://en.wikipedia.org/wiki/DevOps
"OpsSchool Curriculum" http://www.opsschool.org
- Soft Skills 101, 201
- Labs Exercises
- Free. Contribute
awesome-sysadmin > configuration-management https://github.com/kahun/awesome-sysadmin/blob/master/README...
- This could list reusable module collections such as Puppet Forge and Ansible Galaxy;
- And module testing tools like Puppet Beaker and Ansible Molecule (that can use Vagrant or Docker to test a [set of] machines)
https://github.com/stack72/ops-books
- I'd add "Time Management for System Administrators" (2005)
https://landing.google.com/sre/books/
- There's now a "Site Reliability Workbook" to go along with the Google SRE book. Both are free online.
https://response.pagerduty.com
- The PagerDuty Incident Response Documentation is also free online.
- OpsGenie has a free plan also with incident response alerting and on-call management.
There are a number of awesome-devops lists.
Minikube and microk8s package Kubernetes into a nice bundle of distributed systems components that'll run on Lin, Mac, Win. You can convert docker-compose.yml configs to Kubernetes pods when you decide that it should've been HA with a load balancer SPOF and x.509 certs and a DRP (Disaster Recovery Plan) from the start!
Maybe You Don't Need Kubernetes
The argument that Kubernetes adds complexity is, in my opinion, bogus. Kubernetes is a "define once, forget about it" type of infrastructure. You define the state you want your infrastructure to be in, and Kubernetes takes care of maintaining that state. Tools like Ansible and Puppet, as great as they are, do not guarantee your infrastructure will end up in the state you defined, and you can easily end up with broken services. The only complexity in Kubernetes is the fact that it forces you to think and carefully design your infra in a way people aren't used to, yet. More upfront, careful thinking isn't complexity. It can only benefit you in the long run.
There is, however, a learning curve to Kubernetes, but it isn't that sharp. It does require you to sit down and read the docs for 8 hours, but that's a small price to pay.
A few months back I wrote a blog post[1] that, by walking through the few different infrastructures my company experimented with over the years, surfaces many reasons one would want to use [a managed] Kubernetes. (For a shorter read, you can probably start at [2].)
[1]: https://boxunix.com/post/bare_metal_to_kube
[2]: https://boxunix.com/post/bare_metal_to_kube/#_hardware_infra...
It's pretty common for new technologies to advertise themselves as "adopt, and forget about it", but in my experience it's unheard of that any actually deliver on this promise.
Any technology you adopt today is a technology you're going to have to troubleshoot tomorrow. (I don't think the 15,000 Kubernetes questions on StackOverflow are all from initial setup.) I can't remember the last [application / service / file format / website / language / anything related to computer software] that was so simple and reliable that I wasn't searching the internet for answers (and banging my head against the wall because of) the very next month. It was probably something on my C=64.
As Kernighan said back in the 1970's, "Everyone knows that debugging is twice as hard as writing a program in the first place. So if you're as clever as you can be when you write it, how will you ever debug it?" I've never used Kubernetes, but I've read some articles about it and watched some videos, and despite the nonstop bragging about its simplicity (red flag #1), I'm not sure I can figure out how to deploy with it. I'm fairly certain I wouldn't have any hope of fixing it when it breaks next month.
Hearing testimonials only from people who say "it doesn't break!" is red flag #2. No technology works perfectly for everyone, so I want to hear from the people who had to troubleshoot it, not the people who think it's all sunshine and rainbows. And those people are not kind, and make it sound like the cost is way more than just "8 hours reading the docs" -- in fact, the docs are often called out as part of the problem.
> As Kernighan said back in the 1970's, "Everyone knows that debugging is twice as hard as writing a program in the first place. So if you're as clever as you can be when you write it, how will you ever debug it?"
What a great quote. Thanks
Some more quotes taken from http://typicalprogrammer.com/what-does-code-readability-mean that you might find interesting.
Any fool can write code that a computer can understand. Good programmers write code that humans can understand. – Martin Fowler
Just because people tell you it can’t be done, that doesn’t necessarily mean that it can’t be done. It just means that they can’t do it. – Anders Hejlsberg
The true test of intelligence is not how much we know how to do, but how to behave when we don’t know what to do. – John Holt
Controlling complexity is the essence of computer programming. – Brian W. Kernighan
The most important property of a program is whether it accomplishes the intention of its user. – C.A.R. Hoare
No one in the brief history of computing has ever written a piece of perfect software. It’s unlikely that you’ll be the first. – Andy Hunt
> Tools like Ansible and Puppet, as great as they are, do not guarantee your infrastructure will end up in the state you defined and you easily end up with broken services.
False dilemma. Ansible and Puppet are great tools for configuring kubernetes, kubernetes worker nodes, and building container images.
Kubernetes does not solve for host OS maintenance; though there are a number of host OS projects which remove most of what they consider to be unnecessary services, there's still need to upgrade kubernetes nodes and move pods out of the way first (which can be done with e.g. Puppet or Ansible).
As well, it may not be appropriate for monitoring to depend upon kubernetes; there again you have nodes to manage with an SCM tool.
Quantum Machine Appears to Defy Universe’s Push for Disorder
Maybe an example or related to time crystals?
"Scar (physics)": https://en.wikipedia.org/wiki/Scar_(physics)
> Scars are unexpected in the sense that stationary classical distributions at the same energy are completely uniform in space with no special concentrations along periodic orbits, and quantum chaos theory of energy spectra gave no hint of their existence
Pytype checks and infers types for your Python code
How does pytype compare with the PyAnnotate [1] and MonkeyType [2] dynamic / runtime PEP-484 type annotation type inference tools?
How I'm able to take notes in mathematics lectures using LaTeX and Vim
As a feat in itself this is definitely very impressive, but I wonder if it's really worth spending precious lecture time with your mind fully occupied by the mechanical task of taking notes rather than actually absorbing and engaging with the content. Especially for an extremely content-dense subject like mathematics, where you need all your concentration just to process what you are reading and follow along with the logic.
There is absolutely no dearth of study material on the internet. It's one thing to take notes as part of your study process. There it helps solidify your understanding. But surely taking notes on the fly when you have been barely introduced to the subject isn't going to help with that.
> mechanical task of taking notes rather than actually absorbing and engaging with the content
The mechanical task of taking notes is one of the most important parts of actually absorbing the material. It is not an either-or. Hearing/seeing the information, processing it in a way that makes sense to you individually, and then mechanically writing it down in a legible manner is one of the main methods that your brain learns. It's one of the primary reasons that taking notes is important in the first place. This is referred to as the "encoding hypothesis" [1].
There are actually even studies [2] that show that tools that assist in more efficient note taking, such as taking notes via typing rather than by hand, are actually detrimental to absorbing information, as it makes it easier for you to effectively pass the information directly from your ears to your computer without actually doing the processing that is required when writing notes by hand. This is why many universities prefer (or even require) notes to be taken by hand, and disallow laptops in class.
1: https://www.sciencedirect.com/science/article/pii/0361476X78...
2: https://www.npr.org/2016/04/17/474525392/attention-students-...
> The mechanical task of taking notes is one of the most important parts of actually absorbing the material. It is not an either-or. Hearing/seeing the information, processing it in a way that makes sense to you individually, and then mechanically writing it down in a legible manner is one of the main methods that your brain learns. It's one of the primary reasons that taking notes is important in the first place. This is referred to as the "encoding hypothesis" [1].
There's almost certainly an advantage to learning to think about math using a publishable symbol set like LaTeX.
We learn by reinforcement, with feedback loops that, in a typical university course, may not close until weeks later.
> There are actually even studies [2] that show that tools that assist in more efficient note taking, such as taking notes via typing rather than by hand, are actually detrimental to absorbing information, as it makes it easier for you to effectively pass the information directly from your ears to your computer without actually doing the processing that is required when writing notes by hand.
Handwriting notes is impractical for some people due to e.g. injury and illegibility.
The linked study regarding retention and handwritten versus typed notes has been debunked; see the references cited elsewhere in the comments on this post. There have been a few studies with insufficient controls (lack of randomization, for one) which have been widely repeated by educators who want to be given attention.
Doodling has been shown to increase information retention. Maybe doodling as a control really would be appropriate.
Banning laptops from lectures is not respectful of students with injury and illegible handwriting. Asking people to put their phones on silent (so they can still make and take emergency calls) and refrain from distracting other students with irrelevant content on their computers is reasonable and considerate.
(What a cool approach to math note-taking. I feel a bit inferior because I haven't committed to learning that valuable, helpful skill, and so: that's stupid and you're just wasting your time, because that's not even necessary when all you need to do is retain the information you've paid for for the next few months at most. Of course, once you get on the job, you'll never be using that tool and e.g. latex2sympy to actually apply that theory to solving a problem that people are willing to pay for. So, thanks for the tips and kudos, idiot)
LHCb discovers matter-antimatter asymmetry in charm quarks
So, does this disprove all of supersymmetry? https://en.wikipedia.org/wiki/Supersymmetry
No, supersymmetry and charge-parity symmetry are different.
Ah, thanks.
"CPT Symmetry" https://en.wikipedia.org/wiki/CPT_symmetry
"CP Violations" https://en.wikipedia.org/wiki/CP_violation
"Charm quark" https://en.wikipedia.org/wiki/Charm_quark :
> The antiparticle of the charm quark is the charm antiquark (sometimes called anticharm quark or simply anticharm), which differs from it only in that some of its properties have equal magnitude but opposite sign.
React Router v5
Using react router on one of my personal projects (~ 30k loc) was probably one of my largest regrets. At every turn it seemed designed to do the thing I wouldn’t expect, or have arbitrary restrictions that made my life tougher.
Some examples:
* there's no relative routes https://github.com/ReactTraining/react-router/issues/2172
* there's no way to refresh the page https://github.com/ReactTraining/react-router/issues/1982 ("that's your responsibility, not ours")
* The scroll position will stick when you navigate to a new route, causing you to need to create a custom component wrapper to manage scrolling https://github.com/ReactTraining/react-router/issues/3950
* React Router's <Link> doesn't allow you to link outside of the current site https://github.com/ReactTraining/react-router/issues/1147 so you have to use a plain <a> tag in those cases. This doesn't sound bad, but it's particularly frustrating when dealing with dynamic or user-generated content. Why can't they handle this simple case?
> there's no relative routes
> React router's <link> do not allow you to link outside of the current site
Both valid points. Not show stoppers though. And not enough to make me regret using the library. The solution to the external links issue is a one-liner.
> there's no way to refresh the page https://github.com/ReactTraining/react-router/issues/1982 ("that's your responsibility, not ours")
That is your responsibility. Why do you need the routing library to handle page refreshing for you?
> the scroll position will stick when you navigate to a new route
Making the page scroll to the top when navigating to a new page is trivial. I would 100% rather have this problem instead of the opposite: scroll jumping to the top when I don't want it to. That's so much harder to fix.
> Navigating to the same page would not actually reload the page, it would just trigger a componentDidMount() on all components in the page again, which led me to have a lot of bugs when I did some initialization in my constructor
That's exactly what it's supposed to do. It's a client side routing solution. (I'm also pretty sure that it doesn't remount)
From the issues that you've had with the library, it seems like client side routing is not actually what you're looking for. If you regret using it so much, may I ask what the alternative would be?
IMO a project either needs client side routing, or it doesn't. If it does, then React Router is the obvious choice. Otherwise, of course, don't use it and save yourself from unnecessary complexity.
> The solution to the external links issue is a one-liner.
It is most definitely not a one-liner.
> Making the page scroll to the top when navigating to a new page is trivial.
So do it for me. There’s no reason a routing library should break default behavior.
> That's exactly what it's supposed to do. It's a client side routing solution. (I'm also pretty sure that it doesn't remount)
Then it’s “supposed” to have inconsistent behavior. Navigating anywhere else in my site will call constructors to all my components. Navigating to the same page won’t.
> From the issues that you've had with the library, it seems like client side routing is not actually what you're looking for.
My site was a music site. I wanted clientside routing so I could keep music playing while you navigated between links. Seemed like a slam dunk for clientside routing to me.
> may I ask what the alternative would be?
Quite honestly I would go onto GitHub and look for any routing solution that didn’t have multiple closed unfixed issues with hundreds of thumbs up.
> So do it for me.
I think this will work:
Put this wherever you put your reusable functions:
`const scrollToTop = () => document.getElementById('root').scrollIntoView();`
And put this in the components / functions you want the scrollToTop effect to work on:
`useEffect(() => { scrollToTop() }, []);`
And put this in your CSS:
`html { scroll-behavior: smooth }`
(Edit to add: Not saying React Router is something you should / shouldn’t use. Just wanted to share that code in case it helps unblock anyone.)
Accidentally downvoted on mobile (and upvoted two others). Thanks for this.
"Scroll Restoration" https://reacttraining.com/react-router/web/guides/scroll-res...
Experimental rejection of observer-independence in the quantum world
Objective truth!? A question for epistemologists to decide.
How could they record their high-entropy (?) solipsistic observations in an immutable datastore in such a way as to have provably zero knowledge of the other party's observations?
Anyways, that's why I only read the title and the abstract.
Wigner's friend experiment: https://en.wikipedia.org/wiki/Wigner%27s_friend
Show HN: A simple Prolog Interpreter written in a few lines of Python 3
Cool tests! PyDatalog does Datalog (which is ~Prolog, but similar and very capable) logic programming with SQLAlchemy (and database indexes) and apparently NoSQL support. https://sites.google.com/site/pydatalog/
Datalog: https://en.wikipedia.org/wiki/Datalog
... TBH, IDK about logic programming and bad facts. Resilience to incorrect and incredible information is - I suppose - a desirable feature of any learning system that reevaluates its learnings as additional and contradictory information makes its way into the datastores.
Thanks for the feedback :) I'll definitely check out Datalog - I didn't realize they had logic programming integrated with SQLAlchemy, so it definitely sounds interesting!
How to earn your macroeconomics and finance white belt as a software developer
Thanks for the wealth of resources in this post. Here are a few more:
"Python for Finance: Analyze Big Financial Data" (2014, 2018) https://g.co/kgs/qkY8J6 ... https://pyalgo.tpq.io also includes the "Finance with Python" course and this book as a PDF and Jupyter notebooks.
Quantopian put out a call for the best Value Investing algos (implemented in quantopian/zipline) awhile back. This post links to those and other value investing resources: https://westurner.github.io/hnlog/#comment-19181453 (Ctrl-F "econo")
"Lectures in Quantitative Economics as Python and Julia Notebooks" https://news.ycombinator.com/item?id=19083479 links to these excellent lectures and a number of tools for working with actual data from FRED, ECB, Eurostat, ILO, IMF, OECD, UNSD, UNESCO, World Bank, Quandl.
One thing that many finance majors, courses, and resources often fail to identify is the role that startup and small businesses play in economic growth and actual value creation: jobs, GDP, return on direct capital investment. Most do not succeed, but it is possible to do better than index funds and have far more impact in terms of sustainable investment than as an owner of a nearly-sure-bet index fund that owns some shares and takes a hands-off approach to business management, research, product development, and operations.
Is it possible to possess a comprehensive understanding of finance and economics but still not have personal finance down? Personal finance: r/personalfinance/wiki, "Consumer science (a.k.a. home economics) as a college major" https://news.ycombinator.com/item?id=17894632
Ask HN: Relationship between set theory and category theory
I have an idea about the relationship between set theory and category theory and I would like some feedback. I would like others to see it too, and I don't know how to do it. I think it's at least interesting to look at as a slightly crazy collage, but I was a bit more excited than normal when the idea hit, so I just had to dump it all at once in this image: https://twitter.com/FamilialRhino/status/1101777965724168193 (You will have to zoom the picture in order to be able to read the scribbles.)
It has to do with resonance in the energy flowing in emergent networks. Can't quite put my finger on it, so I'll be here to answer any questions.
Thanks for reading.
"Categorical set theory" > "References" https://en.wikipedia.org/wiki/Categorical_set_theory#Referen...
From "Homotopy category" > "Concrete categories" https://en.wikipedia.org/wiki/Homotopy_category#Concrete_cat... :
> While the objects of a homotopy category are sets (with additional structure), the morphisms are not actual functions between them, but rather classes of functions (in the naive homotopy category) or "zigzags" of functions (in the homotopy category). Indeed, Freyd showed that neither the naive homotopy category of pointed spaces nor the homotopy category of pointed spaces is a concrete category. That is, there is no faithful functor from these categories to the category of sets.
My "understanding" of category theory is extremely shallow, but that's exactly why I think my proposal makes sense. It is the kind of thing that everybody ignores for decades precisely because it's transparently obvious, like a fish that doesn't understand water.
Here is the statement:
The meaning of no category is every category.
reference: https://terrytao.wordpress.com/2008/02/05/the-blue-eyed-isla...
This was already understood by everybody in the field, no doubt. It's just that somebody has to actually say it to someone else in order for the symmetry to break. The link above has the exact description of this, from Terence Tao.
The most popular docker images each contain at least 30 vulnerabilities
Although vulnerability scanners can be a useful tool, I find it very troublesome that you can utter the sentence "this package contains XX vulnerabilities, and that package contains YY vulnerabilities" and then stop talking. You've provided barely any useful information!
The quantity of vulnerabilities in an image is not really all that useful information. A large number of vulnerabilities in a Docker image does not necessarily imply that anything insecure is going on. Many people don't realize that a vulnerability is usually defined as "has a CVE security advisory", and that CVEs get assigned based on a worst-case evaluation of the bug. As a result, having a CVE in your container barely tells you anything about your actual vulnerability position. In fact, most of the time you will find that having a CVE in some random utility doesn't matter. Most CVEs in system packages don't apply to most of your containers' threat models.
Why not? Because an attacker is very unlikely to be able to use vulnerabilities in these system libraries or utilities. Those utilities are usually not in active use in the first place. Even if they are used, you are not usually in a position to exploit these vulnerabilities as an attacker.
Just as an example, a hypothetical outdated version of grep in one of these containers can hypothetically contain many CVEs. But if your Docker service doesn't use grep, then you would need to manually run grep to be vulnerable. And an attacker that is able to run grep in your Docker container has already owned you - it doesn't make a difference that your grep is vulnerable! This hypothetical vulnerable version of grep therefore makes no difference in the security of your container, despite containing many CVEs.
It's the quality of these vulnerabilities that matters. Can an attacker actually exploit the vulnerabilities to do bad things? The answer for almost all of these CVEs is "no". But that's not really the product that Snyk sells - Snyk sells a product that shows you as many vulnerabilities as possible. Any vulnerability scanner company thinks it can provide the most business value (and make the most money) by reporting as many vulnerabilities as it can. For sure it can help you pinpoint the few vulnerabilities that are exploitable, but that's where your own analysis comes in.
I'm not saying there's not a lot to improve in terms of container security. There's a whole bunch to improve there. But focusing on quantities like "amount of CVEs in an image" is not the solution - it's marketing.
This whole pattern of depending on a base container for setup and then essentially throwing away everything you get from the package manager is part of the issue. There is no real package management for Docker. Hell, there isn't even an official way to determine whether your image needs an upgrade (I wrote a Ruby script that fetches the latest tags from a Docker repo, extracts the numeric ones, sorts them in version order, and compares them to what I have running).
Relying on an Alpine/Debian/Ubuntu base helps to get dependencies installed quickly. Docker could have just created their own base distro and some mechanism to track package updates across images, but they did not.
There are guides for making bare containers that contain nothing: no ip, no grep, no bash, only the bare minimum libraries and requirements to run your service. They are minimal, but incredibly difficult to debug (sysdig still sucks unless you shell out money for enterprise).
I feel like containers are alright, but Docker is a partial dumpster fire. cgroup isolation is good, the crazy way we deal with packages in container systems is not so good.
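For what it's worth, the tag-comparison part of such a script is mostly version parsing and tuple sorting. A minimal Python sketch (the function names are mine, and a real registry client would also have to handle pagination and filter non-release tags like `latest` or `edge`):

```python
import re


def parse_version(tag):
    """Extract a sortable version tuple from a tag like '1.2.3' or 'v2.0'.

    Returns None for non-numeric tags such as 'latest' or 'edge'.
    """
    m = re.fullmatch(r"v?(\d+(?:\.\d+)*)", tag)
    if m is None:
        return None
    return tuple(int(part) for part in m.group(1).split("."))


def newer_tags(tags, running):
    """Return tags strictly newer than the running version, oldest first.

    Tuples sort correctly where strings would not ('1.10' > '1.9').
    """
    current = parse_version(running)
    if current is None:
        raise ValueError("running version must be numeric, e.g. '1.9'")
    newer = [(v, t) for v, t in ((parse_version(t), t) for t in tags)
             if v is not None and v > current]
    return [t for v, t in sorted(newer)]
```

Comparing tuples of ints rather than raw strings is the whole trick: a plain lexicographic sort would put `1.9` after `1.10`.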
Sure, if you're just checking base-distro packages for security vulnerabilities, you're going to find security issues that don't apply (e.g. an exploit in libpng even though your container runs nothing that even links to libpng), but that doesn't excuse the whole issue with the way containers are constructed.
I think this space is really open too, for people to find better systems that are also portable: image formats that are easy to build, easy to maintain dependencies for, and easy to run as FreeBSD jails OR Linux cgroup managed containers (Docker for FreeBSD translated images to jails, but it's been unmaintained for years).
I agree the tooling is a tire fire :(
> e.g. an exploit in libpng even though your container runs nothing that even links to libpng
It's a problem because some services or APIs could be abused to give the attacker a path to these vulnerable resources, letting them exploit the vulnerability regardless of whether your service currently uses it.
I like my images to contain only what's absolutely needed to do that one job. It's not so difficult to do, provided people are willing to architect systems from the ground up instead of pulling in a complete Debian or Fedora installation and then removing things (that should be outlawed imho lol). Not only do I get less attack surface, but also smaller updates (which in turn is an incentive to update more often), less complexity, fewer logs, easier auditing (now every log file or even log line might give valid clues), faster incident response, easier troubleshooting, sorry for going on and on ...
It's a cultural problem too: people work in environments where it's normal to have every command available on a production system (really?), where there is no barrier to installing anything new that is "required" without discussion & peer review (what are we pair programming for?), and where nobody tracks the dead weight in production or whether configs are locked down.
I sometimes think many companies lost control over this long ago. [citation needed] :(
I'm not familiar with Docker infrastructure but what is the alternative to "pulling in a complete debian or fedora installation and then removing things"? Compiling your own kernel and doing the whole "Linux From Scratch" thing? Isn't that incredibly time-intensive to do for every single container?
Just have an image with a very minimal userland. Compiling your own kernel is irrelevant because the container runs on the host kernel, and container images don't contain a kernel.
The busybox image is a good starting point. Take that, then copy your executables and libraries. If you are willing to go further, you can rather easily compile your own busybox with most utilities stripped out. It's not time intensive because you need to do it just once, and it takes just an afternoon to figure out how.
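As a hedged sketch of what that can look like, here is a multi-stage Dockerfile that ships only a static binary on top of busybox. The stage layout, the `golang` builder, and the `myservice` name are all assumptions for illustration, not from the comment; the same pattern works with any toolchain that can emit a static binary:

```dockerfile
# Build stage: compile a static binary so the runtime image needs no libc
FROM golang:1.12 AS build
WORKDIR /src
COPY . .
RUN CGO_ENABLED=0 go build -o /myservice .

# Runtime stage: busybox (or even scratch) plus just the service binary
FROM busybox
COPY --from=build /myservice /myservice
USER 1000
ENTRYPOINT ["/myservice"]
```

The final image contains busybox's utilities and your binary, nothing else; swapping `busybox` for `scratch` removes even the shell.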
I don't think this is a tooling problem at all.
"The tooling makes it too easy to do it wrong." Compared to shell scripts with package manager invocations? Nobody configures a system with just packages: there are always scripts to call, chroots to create, users and groups to create, passwords to set, firewall policies to update, etc.
There are a bunch of ways to create LXC containers: shell scripts, Docker, ansible. Shell scripts preceded Docker: you can write a function to stop, create an intermediate tarball, and then proceed (so that you don't have to run e.g. debootstrap without a mirror every time you manually test your system build script; so that you can cache build steps that completed successfully).
With Docker images, the correct thing to do is to extend FROM the image you want to use, build the whole thing yourself, and then tag and store your image in a container repository. Neither should you rely upon months-old liveCD images.
"You should just build containers on busybox." So, no package management? A whole ensemble of custom builds to manually maintain (with no AppArmor or SELinux labels)? Maintainers may prefer for distros to field bug reports for their own common build configurations and known-good package sets. Please don't run as root in a container ("because it's only a container that'll get restarted someday"). Busybox is not a sufficient OS distribution.
It's not the tools, it's how people are choosing to use them. They can, could, and should try and use idempotent package management tasks within their container build scripts; but they don't and that's not Bash/Ash/POSIX's fault either.
> With Docker images, the correct thing to do is to extend FROM the image you want to use, build the whole thing yourself, and then tag and store your image in a container repository. Neither should you rely upon months-old liveCD images.
This should rebuild all. There should be an e.g. `apt-get upgrade -y && rm -rf /var/lib/apt/lists` in there somewhere (because base images are usually not totally current (and neither are install ISOs)).
`docker build --no-cache --pull`
You should check that each Dockerfile extends FROM `tag:latest` or the latest version of the tag that you support. It's not magical; you do have to work at it.
Also, IMHO, Docker SHOULD NOT create another Linux distribution.
Tinycoin: A small, horrible cryptocurrency in Python for educational purposes
The 'dumbcoin' jupyter notebook is also a good reference: "Dumbcoin - An educational python implementation of a bitcoin-like blockchain" https://nbviewer.jupyter.org/github/julienr/ipynb_playground...
When does the concept of equilibrium work in economics?
"Modeling stock return distributions with a quantum harmonic oscillator" (2018) https://iopscience.iop.org/article/10.1209/0295-5075/120/380...
> We propose a quantum harmonic oscillator as a model for the market force which draws a stock return from short-run fluctuations to the long-run equilibrium. The stochastic equation governing our model is transformed into a Schrödinger equation, the solution of which features "quantized" eigenfunctions. Consequently, stock returns follow a mixed χ distribution, which describes Gaussian and non-Gaussian features. Analyzing the Financial Times Stock Exchange (FTSE) All Share Index, we demonstrate that our model outperforms traditional stochastic process models, e.g., the geometric Brownian motion and the Heston model, with smaller fitting errors and better goodness-of-fit statistics. In addition, making use of analogy, we provide an economic rationale of the physics concepts such as the eigenstate, eigenenergy, and angular frequency, which sheds light on the relationship between finance and econophysics literature.
"Quantum harmonic oscillator" https://en.wikipedia.org/wiki/Quantum_harmonic_oscillator
The QuantEcon lectures have a few different multiple agent models:
"Rational Expectations Equilibrium" https://lectures.quantecon.org/py/rational_expectations.html
"Markov Perfect Equilibrium" https://lectures.quantecon.org/py/markov_perf.html
"Robust Markov Perfect Equilibrium" https://lectures.quantecon.org/py/rob_markov_perf.html
"Competitive Equilibria of Chang Model" https://lectures.quantecon.org/py/chang_ramsey.html
... "Lectures in Quantitative Economics as Python and Julia Notebooks" https://news.ycombinator.com/item?id=19083479 (data sources (pandas-datareader, pandaSDMX), tools, latex2sympy)
"Econophysics" https://en.wikipedia.org/wiki/Econophysics
> Indeed, as shown by Bruna Ingrao and Giorgio Israel, general equilibrium theory in economics is based on the physical concept of mechanical equilibrium.
Simdjson – Parsing Gigabytes of JSON per Second
> Requirements: […] A processor with AVX2 (i.e., Intel processors starting with the Haswell microarchitecture released 2013, and processors from AMD starting with the Ryzen)
Also noteworthy that on Intel at least, using AVX/AVX2 reduces the frequency of the CPU for a while. It can even go below base clock.
iirc, it's complicated. Some instructions don't reduce the frequency; some reduce it a little; some reduce it a lot.
I'm not sure AVX2 is as ubiquitous as the README says: "We assume AVX2 support which is available in all recent mainstream x86 processors produced by AMD and Intel."
I guess "mainstream" is somewhat subjective, but some recent Chromebooks have Celeron processors with no AVX2:
https://us-store.acer.com/chromebook-14-cb3-431-c5fm
https://ark.intel.com/products/91831/Intel-Celeron-Processor...
Because someone wanting 2.2GB/s JSON parsing is deploying to a chromebook...
It doesn't seem that laughable to me to want faster JSON parsing on a Chromebook, given how heavily JSON is used to communicate between webservers and client-side Javascript.
"Faster" meaning faster than Chromebooks do now; 2.2 GB/s may simply be unachievable hardware-wise with these cheap processors. They're kinda slow, so any speed increase would be welcome.
AVX2 also incurs some pretty large penalties for switching between SSE and AVX2. Depending on the amount of time taken in the library between calls, it could be problematic.
This looks mostly applicable to server scenarios where the runtime environment is highly controlled.
There is no real penalty for switching between SSE and AVX2, unless you do it wrong. What are you referring to specifically?
Are you talking about state transition penalties that can occur if you forget a vzeroupper? That's the only thing I'm aware of which kind of matches that.
A faster, more efficient cryptocurrency
Full disclosure: I work on the cryptocurrency in this article, Algorand.
There are a lot of questions and speculation here about this paper and Algorand. I would be happy to try to answer them to your satisfaction. Some context may be helpful first, though. This paper is an innovation in one aspect of our technology. Algorand has a very fast consensus mechanism and can add blocks as quickly as the network can deliver them. We become a victim of our success. The blockchain will grow very rapidly. A terabyte a month is possible. The storage issue associated with our performance can quickly become an issue. The Vault paper is focused on solving this and other storage scaling problems.
The Algorand pure proof-of-stake blockchain and associated cryptocurrency has many novel innovations aside from Vault. It possesses security and scalability properties beyond what any other blockchain technology allows while still being completely decentralized. Our website, algorand.com, and whitepaper are great places to start to learn more.
If you learn best from videos then I suggest you watch Turing award winner and cryptographic pioneer, Silvio Micali, talk about Algorand: https://youtu.be/NykZ-ZSKkxM. He is a captivating speaker and the founder of Algorand.
Are there reasons that e.g. Bitcoin and Ethereum and Stellar could not implement some of these more performant approaches that Algorand [1] and Vault [2] have developed, published, and implemented? Which would require a hard fork?
[2] https://dspace.mit.edu/handle/1721.1/117821
My understanding is that PoS approaches follow normal byzantine agreement theory which states that adversaries cannot control more than 1/3rd of the accounts (or money in the case of algorand). You can also delay new blocks more easily.
Ethereum is scared of that, so they are implementing a hybrid form.
Bitcoin is doomed from my perspective because of its focus on proof of work and its confirmation times. Once you realize that Algorand is super fast, has no "confirmation time", and wastes no energy on mining, it is hard to back any cryptocurrency focused on proof of work.
And what of decentralized premined chains (with no PoW, no PoS, and far less energy use) that release coins with escrow smart contracts over time such as Ripple and Stellar (and close a new ledger every few seconds)?
> Algorand has a very fast consensus mechanism and can add blocks as quickly as the network can deliver them. We become a victim of our success. The blockchain will grow very rapidly. A terabyte a month is possible. The storage issue associated with our performance can quickly become an issue. The Vault paper is focused on solving this and other storage scaling problems.
What prevents a person from using a chain like IPFS?
Ethereum Casper PoS has been under review for quite some time.
Why isn't all Bitcoin on Lightning Network?
Bitcoin could make bootstrapping faster by choosing a considered-good block hash and balance snapshot, but AFAIU, re-verifying transactions the way Bitcoin and its derivatives do prevents hash collision attacks, which are currently considered infeasible for SHA-256 (especially given the low block size).
There was an analysis somewhere where they calculated the cloud server instance costs of mounting a ~51% attack (which applies to PoW chains) for various blockchains.
Bitcoin is not profitable to mine in places without heavily subsidized dirty/clean energy anymore: energy and Bitcoin commodity costs and prices have intersected. They'll need any of: inexpensive clean energy, more efficient chips, higher speculative value.
Energy arbitrage (grid-scale energy storage) may be more profitable now. We need energy storage in order to reach 100% renewable energy (regardless of floundering policy support).
Ripple is not decentralized. I don't know enough about Stellar to answer.
Bitcoin is software and can easily implement these features but the community is divided and can't reach consensus on anything. Lightning Network as layer two solution is pretty good from what I know.
Ethereum improvements are coming along very slowly and that's good. They're the only blockchain with active engagement by thousands of multiple parties.
Algorand's and Vault's papers might sound good, but who knows how they'll turn out in production.
People argue this all day. There's a lot of FUD.
Ripple only runs ~7% of validator nodes; which is far less centralized control than major Bitcoin mining pools and businesses (who do the deciding in regards to the many Bitcoin hard forks); that's one form of decentralization.
Ripple clients can use their own UNL or use the Ripple-approved UNL.
Ripple is traded on a number of exchanges (though fewer than Bitcoin for certain); that's another form of decentralization.
As an open standard, ILP will further reduce vendor lock in (and increase interoperability between) networks that choose to implement it.
There are forks of Ripple (e.g. Stellar) just like there are forks of Bitcoin and Ethereum.
From https://ripple.com/insights/the-inherently-decentralized-nat... :
> In contrast, the XRP Ledger requires 80 percent of validators on the entire network, over a two-week period, to continuously support a change before it is applied. Of the approximately 150 validators today, Ripple runs only 10. Unlike Bitcoin and Ethereum — where one miner could have 51 percent of the hashing power — each Ripple validator only has one vote in support of an exchange or ordering a transaction.
How does your definition of 'decentralized' differ?
[deleted]
Git-signatures – Multiple PGP signatures for your commits
Is there anything out there that doesn't need GPG? Having a working GPG install is a huge lift for developers.
I take this to mean: apart from the barnacles on GPG, could there be a system which does what GPG does for software development (signing), without the non-functioning web-of-trust of GPG, or the hierarchical system of x509 signing? Something that deals with lost keys, compromised keys/accounts, loss of DNS control, MitMing, MitBing, etc?
I think it is probably in the class of problems where there are no great foolproof solutions. However, I can imagine that techniques like certificate transparency (all signed x509 certificates pushed to a shared log) would be quite useful. Even blockchain techniques. Maybe send someone to check on me, I'm feeling unwell having written that.
> I think it is probably in the class of problems where there are no great foolproof solutions. However, I can imagine that techniques like certificate transparency (all signed x509 certificates pushed to a shared log) would be quite useful.
Securing DNS: "https://news.ycombinator.com/item?id=19181362"
> Certs on the Blockchain: "Can we merge Certificate Transparency with blockchain?" https://news.ycombinator.com/item?id=18961724
> Namecoin (decentralized blockchain DNS): https://en.wikipedia.org/wiki/Namecoin
(Your first link is broken.)
My main problem with blockchain is the excessive energy consumption of PoW. I know there are PoS efforts, but they seem problematical.
I like the recent CertLedger paper: https://eprint.iacr.org/2018/1071.pdf
My mistake. How ironic. Everything depends upon the red wheelbarrow. Here's that link without the trailing quote: https://news.ycombinator.com/item?id=19181362
> My main problem with blockchain is the excessive energy consumption of PoW. I know there are PoS efforts, but they seem problematical.
One report said that 78% of Bitcoin energy usage is from renewable sources (many of which would otherwise be curtailed and otherwise unfunded due to flat-to-falling demand for electricity). But PoW really is expensive and hopefully the market will choose less energy-inefficient solutions from the existing and future blockchain solutions while keeping equal or better security assurances.
>> Proof of Work (Bitcoin, ...), Proof of Stake (Ethereum Casper), Proof of Space, Proof of Research (GridCoin, CureCoin)
The spec should be: DDoS resilient (without a SPOF), no one entity with control over API and/or database credentials and database backups and the clock, and immutable.
Immutability really cannot be ensured with hashed records that incorporate the previous record's hash as a salt in a blocking centralized database because someone ultimately has root and the clock and all the backups and code vulnerable to e.g. [No]SQL injection; though distributed 'replication' and detection of record modification could be implemented. git push -f may be detected if it's on an already-replicated branch; but git depends upon local timestamps. google/trillian does Merkle trees in a centralized database (for Certificate Transparency).
In quickly reading the git-signatures shell script sources, I wasn't certain: is the git-notes branch with the .gitsigners that are fetched from all n keyservers (with DNS) also signed?
I also like the "Table 1: Security comparison of Log Based Approaches to Certificate Management" in the CertLedger paper. Others are far more qualified to compare implementations.
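The hash-chaining idea mentioned above (each record incorporating the previous record's hash as a salt) is easy to sketch, and the sketch also shows why it only provides tamper *detection*, not prevention: whoever has root can simply recompute the whole chain. A minimal Python illustration (the field names are assumptions):

```python
import hashlib
import json

GENESIS = "0" * 64  # placeholder "previous hash" for the first entry


def chain(records):
    """Build an append-only log where each entry stores the previous entry's hash."""
    entries, prev = [], GENESIS
    for rec in records:
        payload = json.dumps(rec, sort_keys=True)
        digest = hashlib.sha256((prev + payload).encode()).hexdigest()
        entries.append({"record": rec, "prev": prev, "hash": digest})
        prev = digest
    return entries


def verify(entries):
    """Recompute every link; an in-place edit to any record breaks all later hashes."""
    prev = GENESIS
    for e in entries:
        payload = json.dumps(e["record"], sort_keys=True)
        expected = hashlib.sha256((prev + payload).encode()).hexdigest()
        if e["prev"] != prev or e["hash"] != expected:
            return False
        prev = e["hash"]
    return True
```

Detection only works if verifiers hold copies of the chain (or its head hash) that the attacker cannot also rewrite, which is exactly the replication point made above.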
You read my mind. I'd love if it could be rooted in a Yubikey.
Decoupling the "signing" and "verifying" parts seems like a good idea. As a random person signs something, how someone else figures out how to trust that signature is a separate problem.
> I'd love if it could be rooted in a Yubikey.
FIDO2 and Yubico helped develop the new W3C WebAuthn standard: https://en.wikipedia.org/wiki/WebAuthn
But WebAuthn does not solve for WoT or PKI or certificate pinning.
> Decoupling the "signing" and "verifying" parts seems like a good idea. As a random person signs something, how someone else figures out how to trust that signature is a separate problem.
Someone can probably help with terminology here. There's identification (proving that a person has the key AND that it's their key (biometrics, challenge-response)), signing (using a key to create a cryptographic signature – for the actual data or a reasonably secure cryptographic hash of said data – that could only could have been created with the given key), signature verification (checking that the signature was created by the claimed key for the given data), and then there's trusting that the given key is authorized for a specific purpose (Web of Trust (key-signing parties), PKI, ACME, exchange of symmetric keys over a different channel such as QKD) by e.g. signing a structured document that links cryptographic keys with keys for specific authorized functions and trusting the key(s) used to sign said authorizing document.
Private (e.g. Zero Knowledge) blockchains can be used for key exchange and key rotation. Public blockchains can be used for sharing (high-entropy) key components; also with an optional exchange of money to increase the cost of key compromise attempts.
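To illustrate just the sign/verify split from the terminology above, here is a deliberately simplified Python sketch. Note the big caveat: GPG and friends use asymmetric keys, but the Python standard library has no asymmetric signing, so this uses an HMAC as a stand-in, meaning signer and verifier share one key (unlike PGP, where the verifier needs only the public key):

```python
import hashlib
import hmac


def sign(key: bytes, message: bytes) -> bytes:
    """'Signing': derive a tag that only a holder of `key` could have produced."""
    return hmac.new(key, message, hashlib.sha256).digest()


def verify_signature(key: bytes, message: bytes, signature: bytes) -> bool:
    """'Verification': recompute the tag and compare in constant time."""
    return hmac.compare_digest(sign(key, message), signature)
```

Everything beyond this, deciding that the key belongs to the claimed person and is authorized for this purpose (WoT, PKI, key transparency), is the separate and harder trust problem the comment describes.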
There's also WKD: "Web Key Directory"; which hosts GPG keys over HTTPS from a .well-known URL for a given user@domain identifier: https://wiki.gnupg.org/WKD
Compared to existing PGP/GPG keyservers, WKD does rely upon HTTPS.
TUF is based on Thandy. TUF: "The Update Framework" does not presume channel security (is designed to withstand channel compromise) https://en.wikipedia.org/wiki/The_Update_Framework_(TUF)
The TUF spec doesn't mention PGP/GPG: https://github.com/theupdateframework/specification/blob/mas...
There's a derivative of TUF for automotive applications called Uptane: https://uptane.github.io
The Bitcoin article on multisignature; 1-of-2, 2-of-2, 2-of-3, 3-of-5, etc.: https://en.bitcoin.it/wiki/Multisignature
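The m-of-n multisignature idea is just a threshold over independent signature verifications, which maps naturally onto git-signatures' multiple-PGP-signature use case. A sketch (the `verify` callable is an assumption standing in for any underlying signature scheme):

```python
def multisig_ok(m, pubkeys, message, signatures, verify):
    """True if at least m distinct keys from `pubkeys` validly signed `message`.

    Each signature is credited to at most one key, so a repeated
    signature cannot satisfy the threshold by itself.
    """
    valid = set()
    for sig in signatures:
        for key in pubkeys:
            if key not in valid and verify(key, message, sig):
                valid.add(key)
                break
    return len(valid) >= m
```

With a toy `verify` this shows 2-of-3 passing while 3-of-3 fails when only two keys signed.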
Running an LED in reverse could cool future computers
"Near-field photonic cooling through control of the chemical potential of photons" (2019) https://www.nature.com/articles/s41586-019-0918-8
Compounding Knowledge
Buffett’s approach to life is interesting for the same reason an Olympic gymnast is interesting. He has specialized to an extreme and is taking advantage of the rewards of that specialization and natural talent in a unique way.
It’s easy for me to feel shame that I don’t read 8 hours per day, as Warren and Charlie do. Buffett is a phenomenal investor but by all accounts, rather odd. He eats like crap, doesn’t exercise, had a profoundly weird relationship with his wife, and seems addicted to his work at the expense of everything else.
My point is this: his practice of reading income statements and business reports 8 hours per day for 60 years doesn’t make him the kind of person I wish to emulate.
Don’t get me wrong, I respect his levelheadedness toward money, lack of polish wrt PR, and generous philanthropic efforts. There’s a lot of good. But it’s easy to idolize the guy.
Controversial take: Buffett is actually not a great investor in the way people think. Via the float in his insurance companies, he receives a 0% infinite maturity loan to plow into the market. That financial leverage gives him the ability to beat the market year after year - not his own stockpicking prowess.
If you were to start with $1B, then get an extra $2B that you never had to pay back, you too would do quite well holding Coca-Cola and other safe companies with moats. Your returns would be 3x everyone else's.
Buffett himself has tried to explain the impact of float on Berkshire Hathaway but it never seems to sink in with people.
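The float argument above is just leverage arithmetic. A sketch using the comment's hypothetical numbers (not Berkshire's actual figures, and ignoring underwriting losses, which are the real cost of float):

```python
def levered_return(equity, float_capital, asset_return):
    """Return on equity when `float_capital` is invested at zero cost alongside equity.

    The whole pool earns `asset_return`, but all gains accrue to the
    equity holder, multiplying the effective return.
    """
    total = equity + float_capital
    gains = total * asset_return
    return gains / equity
```

With $1B of equity plus $2B of cost-free float and assets earning 10%, the same market return becomes 30% on equity, which is the "3x everyone else" claim in the comment.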
I'll take it even further: Buffett isn't statistically different from average. He just found a strategy that happened to work, was stubborn enough to stick with it through bad times, and used copious amounts of leverage to juice returns.
A really interesting paper called "Buffett's Alpha" talks about this, and was able to replicate his performance by following a few simple rules. They found that he produced very little actual alpha. To his credit, he seems to have been observant enough to stumble into factors (value, quality, and low beta) before anyone else knew they existed, which is his real strength and contribution.
A summary of that paper from the authors is here: https://www.aqr.com/Insights/Research/Journal-Article/Buffet....
They work at AQR, a firm which is notable for being one of the only successful hedge funds to actually publish meaningful research.
The paper's conclusion is noteworthy:
> The efficient market counterargument is that Buffett was simply lucky. Our findings suggest that Buffett’s success is neither luck nor magic but is a reward for a successful implementation of value and quality exposures that have historically produced high returns. Second, we illustrated how Buffett’s record can be viewed as an expression of the practical implementability of academic factor returns after transaction costs and financing costs. We simulated how investors can try to take advantage of similar investment principles. Buffett’s success shows that the high returns of these academic factors are not simply “paper” returns; these returns can be realized in the real world after transaction costs and funding costs, at least by Warren Buffett. Furthermore, Buffett’s exposure to the BAB factor and his unique access to leverage are consistent with the idea that the BAB factor represents reward to the use of leverage.
BTW, AQR funded the initial development of pandas; which now powers tools like alphalens (predictive factor analysis) and pyfolio.
There's your 'compounding knowledge'.
(Days later)
"7 Best Community-Built Value Investing Algorithms Using Fundamentals" https://blog.quantopian.com/fundamentals-contest-winners/
(The Zipline backtesting library also builds upon Pandas)
How can we factor ESG/sustainability reporting into these fundamentals-driven algorithms in order to save the world?
I wonder to what extent Warren Buffett is Warren Buffett because of how he thinks and acts (as all of these non-fiction authors selling books with his name would have us believe), and to what extent he is the product of media selection bias. -- If you take a large enough group of people who take risky stakes that are large enough (like the world of financial asset management), then one of them is bound to be as successful as Warren Buffett, even if they all behave randomly.
Funnily enough, Buffett actually calculated this in his essay "The Superinvestors of Graham-and-Doddsville".
Basically, advocates of the efficient market hypothesis believed that it was impossible for anyone to deliberately, repeatedly generate alpha. The market was rational, and success was probabilistically distributed. With a large enough population, you will get Buffett-level returns; therefore Buffett is a fluke.
However, Buffett calculated that there weren't enough investors for his success to be the product of random distribution, plus there were two dozen others who followed a similar strategy and also consistently generated alpha.
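The "enough investors" argument can be made concrete. Under the null hypothesis that each of N investors independently beats the market in a given year with probability 1/2, the chance that at least one compiles a k-year streak is 1 - (1 - 0.5^k)^N. A sketch (the coin-flip model is the efficient-market null, not a claim about real investors):

```python
def p_at_least_one_streak(n_investors, years, p_beat=0.5):
    """P(some investor beats the market every year) under independent coin flips."""
    p_streak = p_beat ** years
    return 1 - (1 - p_streak) ** n_investors
```

With a million coin-flipping investors, a 20-year market-beating streak is more likely than not to occur somewhere; with only the small population following Graham-style value investing, the essay's point is that streaks like Buffett's (and his two dozen peers') become far too improbable to dismiss as luck.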
"The Superinvestors of Graham and Doddsville" (1984) https://scholar.google.com/scholar?cluster=17265410477248371...
From https://en.wikipedia.org/wiki/The_Superinvestors_of_Graham-a... :
> The speech and article challenged the idea that equity markets are efficient through a study of nine successful investment funds generating long-term returns above the market index.
This book probably doesn't mention that he's given away over 71% to charity since Y2K. Or that it's really cold and windy and snowy in Omaha; which makes for lots of reading time.
"Warren Buffett and the Interpretation of Financial Statements: The Search for the Company with a Durable Competitive Advantage" (2008) [1], "Buffetology" (1999) [2], and "The Intelligent Investor" (1949, 2009) [3] are more investment-strategy-focused texts.
[1] https://smile.amazon.com/Warren-Buffett-Interpretation-Finan...
[2] https://smile.amazon.com/Buffettology-Previously-Unexplained...
[3] https://smile.amazon.com/Intelligent-Investor-Definitive-Inv...
Value Investing: https://en.wikipedia.org/wiki/Value_investing https://www.investopedia.com/terms/v/valueinvesting.asp
Why CISA Issued Our First Emergency Directive
There are a number of efforts to secure DNS (and SSL/TLS which generally depends upon DNS; and upon which DNS-over-HTTPS depends) and the identity proof systems which are used for record-change authentication and authorization.
Domain registrars can and SHOULD implement multi-factor authentication. https://en.wikipedia.org/wiki/Multi-factor_authentication
Are there domain registrars that support FIDO/U2F or the new W3C WebAuthn spec? https://en.wikipedia.org/wiki/WebAuthn
Credentials and blockchains (and biometrics): https://gist.github.com/westurner/4345987bb29fca700f52163c33...
DNSSEC: https://en.wikipedia.org/wiki/Domain_Name_System_Security_Ex...
ACME / LetsEncrypt certs expire after 3 months (*) and require various proofs of domain ownership: https://en.wikipedia.org/wiki/Automated_Certificate_Manageme...
Certificate Transparency: https://en.wikipedia.org/wiki/Certificate_Transparency
Certs on the Blockchain: "Can we merge Certificate Transparency with blockchain?" https://news.ycombinator.com/item?id=18961724
Namecoin (decentralized blockchain DNS): https://en.wikipedia.org/wiki/Namecoin
DNSCrypt: https://en.wikipedia.org/wiki/DNSCrypt
DNS over HTTPS: https://en.wikipedia.org/wiki/DNS_over_HTTPS
DNS over TLS: https://en.wikipedia.org/wiki/DNS_over_TLS
Chrome will Soon Let You Share Links to a Specific Word or Sentence on a Page
I would like to urge the browser developers/makers to adopt existing proposals, which came through open consensus and precisely cover the same use cases (and more!):
W3C Reference Note on Selectors and States: https://www.w3.org/TR/selectors-states/
It is part of the suite of specs that came through the W3C Web Annotation Working Group: https://www.w3.org/annotation/
More examples in W3C Note Embedding Web Annotations in HTML: https://www.w3.org/TR/annotation-html/
Different kinds of Web resources can combine multiple selectors and states. Here is a simple one using `TextQuoteSelector` handled by the https://dokie.li/ clientside application:
http://csarven.ca/dokieli-rww#selector(type=TextQuoteSelecto...
A screenshot/how-to: https://twitter.com/csarven/status/981924087843950595
"Integration with W3C Web Annotations" https://github.com/bokand/ScrollToTextFragment/issues/4
> It would be great to be able to comment on the linked resource text fragment. W3C Web Annotations [implementations] don't recognize the targetText parameter, so AFAIU comments are then added to the document#fragment and not the specified text fragment. [...]
> Is there a simplified mapping of W3C Web Annotations to URI fragment parameters?
Guidelines for keeping a laboratory notebook
My paper notebooks were always horrific. Writing has always been painful and awkward for me. But I could type like nobody's business thanks to programming. I survived college in the early 80s by being one of the first students to get a word processor.
By the time I was keeping a notebook, my work was generating mountains of computer readable data, source code, and so forth. We managed by agreeing on a format for data files, where the filename referenced a notebook page, and it worked OK.
Today, it's unavoidable that people are going to keep their notes electronically, and there are no perfect solutions for doing this. Wet chemists still like paper notebooks, since it's hard to get a computer close to the bench, and to type while wearing rubber gloves. Academic workers are expected to supply their own computers, and are nervous about getting them damaged or contaminated. Plus, drawing pictures and writing equations on a computer are both awkward.
Computation related fields lend themselves well to purely electronic notebooks, no surprise. Today, a lot of my work fits perfectly in a Jupyter notebook.
Commercial notebook software exists, but it tends to be sold largely for enterprise use, i.e., the problem it solves is how to control lab workers and secure their results, not how to enable independent, creative work.
> Computation related fields lend themselves well to purely electronic notebooks, no surprise. Today, a lot of my work fits perfectly in a Jupyter notebook.
Some notes and ideas regarding Jupyter notebooks as lab notebooks from "Keeping a Lab Notebook [pdf]": https://news.ycombinator.com/item?id=15710815
Superalgos and the Trading Singularity
Though others didn't find it interesting, you might: "Ask HN: Why would anyone share trading algorithms and compare by performance?" https://news.ycombinator.com/item?id=15802785 ( https://westurner.github.io/hnlog/#story-15802785 )
I think there is value in a back-testing module; however, sharing an algo doesn't make sense to me, unless someone wants to buy mine for an absurd amount.
I think part of the value of sharing knowledge and algorithmic implementations comes from getting feedback from other experts; like peer review and open science and teaching.
Case in point: the first algorithm on this list [1] of community contributed algorithms that were migrated to their new platform is "minimum variance w/ constraint" [2]. Said algorithm showed returns of over 200% as compared with 77% returns from the SPY S&P 500 ETF over the same period, ceteris paribus. In the 69 replies, there are modifications by community members and the original author that exceed 300%.
Working together on open algorithms has positive returns that may exceed advantages of closed algorithmic development without peer review.
[1] https://www.quantopian.com/posts/community-algorithms-migrat...
[2] https://www.quantopian.com/posts/56b6021b3f3b36b519000924
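For intuition about what "minimum variance" optimizes: the fully-invested minimum-variance weights have a closed form, w = Σ⁻¹1 / (1ᵀΣ⁻¹1), and the community algorithm adds further constraints on top of that. A NumPy sketch (the function name and synthetic data are mine, not from the linked algorithm):

```python
import numpy as np

def min_variance_weights(returns: np.ndarray) -> np.ndarray:
    """Closed-form minimum-variance weights subject only to the
    weights summing to 1: w = inv(cov) @ 1 / (1' @ inv(cov) @ 1)."""
    cov = np.cov(returns, rowvar=False)            # asset covariance matrix
    inv_ones = np.linalg.solve(cov, np.ones(cov.shape[0]))
    return inv_ones / inv_ones.sum()

rng = np.random.default_rng(0)
daily = rng.normal(0.0005, 0.01, size=(500, 4))    # 500 days, 4 assets
w = min_variance_weights(daily)
```

Because equal weights are also feasible, the resulting portfolio variance wᵀΣw is never higher than the equal-weight portfolio's.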
How well does it do in production though and what happens when multiple algos execute the same trades? Does it cause the rest of the algos to adapt and change results? It makes sense to back-test together and work on it, but if it's proven to work, someone will create something to monitor volume on those trades and work against it. I'd be curious to see the same algo do 300% in production, and if so, then my bias would be uncalled for.
> How well does it do in production though and what happens when multiple algos execute the same trades?
Price inflation.
> Does it cause the rest of the algos to adapt and change results?
Trading index ETFs? IDK
> It makes sense to back-test together and work on it, but if it's proven to work, someone will create something to monitor volume on those trades and work against it.
Why does it need to do lots of trades? Is it possible for anyone other than e.g. SEC to review trades by buyer or seller?
> I'd be curious to see the same algo do 300% in production, and if so, then my bias would be uncalled for.
pyfolio does tear sheets with Zipline algos: pyfolio/examples/zipline_algo_example.ipynb https://nbviewer.jupyter.org/github/quantopian/pyfolio/blob/...
alphalens does performance analysis of predictive factors: alphalens/examples/pyfolio_integration.ipynb https://nbviewer.jupyter.org/github/quantopian/alphalens/blo...
awesome-quant lists a bunch of other tools for algos and superalgos: https://github.com/wilsonfreitas/awesome-quant
What's a good platform for paper trading (with e.g. zipline or moonshot algorithms)?
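For a sense of what those tear sheets summarize, the core statistic is simple; a minimal sketch of an annualized Sharpe ratio from daily returns (assuming 252 trading days per year; the returns here are synthetic):

```python
import numpy as np
import pandas as pd

def annualized_sharpe(daily_returns: pd.Series,
                      risk_free_daily: float = 0.0) -> float:
    """Annualized Sharpe ratio from a series of daily returns,
    scaling by sqrt(252) trading days per year."""
    excess = daily_returns - risk_free_daily
    return float(np.sqrt(252) * excess.mean() / excess.std())

rng = np.random.default_rng(1)
returns = pd.Series(rng.normal(0.0005, 0.01, 1000))  # synthetic daily returns
sharpe = annualized_sharpe(returns)
```

pyfolio and alphalens report this alongside drawdown, volatility, and factor exposure statistics.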
I disagree with price inflation just because everything is hedged, but it may be true.
The "too many trades" concern: if there are 300 algos and I look in the order book and see different orders from different exchanges at the same price point, then I would be adapting to what's happening. Not me personally, but there are people who watch order flows.
I don't paper trade; either it works in production with real money or it doesn't. You have to get a feel for spreads, commissions, and so on.
Also, in my case, I am hesitant to even use paid services, as someone can be watching it, so most of my tools are made by me. Good luck with your trading though; if it works out, let me know, I'd pay to use it alongside my other trades.
Crunching 200 years of stock, bond, currency and commodity data
Coming from a CS/Econ/Finance background....
Efficiency as “the market fully processes all information” is decidedly untrue.
Efficiency as “It’s very hard to arb markets without risk better than a passive strategy” is very true.
Most professional money managers don’t beat the market once you factor in fees. Many private equity funds produce outsize returns, but it’s based on higher risk and taking advantage of tax laws.
Over time, positive performance for mutual funds doesn’t persist. (If you’re in the top decile of performance one year, you’re no more likely to be there the next year.)
Despite all of this, there are ways people make money. But it’s a small subset of professionals. It’s high frequency traders who find ways to cut the line on mounds of pennies. It’s inside information from hedge funds. Or earlier access to non-public information. But it’s generally not in areas that normal people like you and me can access.
> It’s inside information from hedge funds. Or earlier access to non-public information. But it’s generally not in areas that normal people like you and me can access.
Asymmetric information is pretty far from what used to be said about the perfect market and rational actors. It's "there's a sucker born every minute" and "if it seems too good to be true it probably is" economics.
I might be misunderstanding what you're saying here, but are you sure you're right? Fama originally predicated the model of the efficient market (the efficient market hypothesis) on the idea of informational efficiency. Information asymmetry is a fundamental measure involved in the idealized model of an efficient market.
What you're mentioning about rational actors is actually a different topic altogether in economics.
Or have I misunderstood what you're getting at?
I was interested, so I did some research here.
Rational Choice Theory https://en.wikipedia.org/wiki/Rational_choice_theory
Rational Behavior https://www.investopedia.com/terms/r/rational-behavior.asp
> Most mainstream academic economics theories are based on rational choice theory.
> While most conventional economic theories assume rational behavior on the part of consumers and investors, behavioral finance is a field of study that substitutes the idea of “normal” people for perfectly rational ones. It allows for issues of psychology and emotion to enter the equation, understanding that these factors alter the actions of investors, and can lead to decisions that may not appear to be entirely rational or logical in nature. This can include making decisions based primarily on emotion, such as investing in a company for which the investor has positive feelings, even if financial models suggest the investment is not wise.
Behavioral finance https://www.investopedia.com/terms/b/behavioralfinance.asp
Bounded rationality > Relationship to behavioral economics https://en.wikipedia.org/wiki/Bounded_rationality
Perfectly rational decisions can be and are made without perfect information; bounded by the information available at the time. If we all had perfect information, there would be no entropy and no advantage; just lag and delay between credible reports and order entry.
Information asymmetry https://en.wikipedia.org/wiki/Information_asymmetry
Heed these words wisely: What foolish games! Always breaking my heart.
https://deepmind.com/blog/game-theory-insights-asymmetric-mu...
> Asymmetric games also naturally model certain real-world scenarios such as automated auctions where buyers and sellers operate with different motivations. Our results give us new insights into these situations and reveal a surprisingly simple way to analyse them. While our interest is in how this theory applies to the interaction of multiple AI systems, we believe the results could also be of use in economics, evolutionary biology and empirical game theory among others.
https://en.wikipedia.org/wiki/Pareto_efficiency
> A Pareto improvement is a change to a different allocation that makes at least one individual or preference criterion better off without making any other individual or preference criterion worse off, given a certain initial allocation of goods among a set of individuals. An allocation is defined as "Pareto efficient" or "Pareto optimal" when no further Pareto improvements can be made, in which case we are assumed to have reached Pareto optimality.
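The quoted definitions translate directly to code; a toy sketch over utility vectors (the function names and example utilities are mine):

```python
def pareto_improvement(a, b):
    """True if allocation b makes at least one individual better off
    and none worse off than allocation a (utilities as numbers)."""
    return (all(y >= x for x, y in zip(a, b))
            and any(y > x for x, y in zip(a, b)))

def pareto_efficient(allocation, alternatives):
    """An allocation is Pareto efficient w.r.t. a set of alternatives
    if no alternative is a Pareto improvement over it."""
    return not any(pareto_improvement(allocation, alt)
                   for alt in alternatives)
```

So [2, 1] → [2, 2] is a Pareto improvement, while [2, 1] → [3, 0] is not (someone is made worse off).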
Which, I think, brings me to equitable availability of maximum superalgo efficiency and limits of real value creation in capital and commodities markets; which'll have to be a topic for a different day.
Show HN: React-Schemaorg: Strongly-Typed Schema.org JSON-LD for React
I have a slightly longer post describing this work and the reasoning behind it on dev.to[1].
[1]: https://dev.to/eyassh/react-schemaorg-strongly-typed-schemaorg-json-ld-for-react-4lhd
Is there a good way to generate JSONschema and thus forms from schema.org RDFS classes and (nested, repeatable) properties?
By JSONschema do you mean [this standard](https://json-schema.org/)? I don't know of a tool that does that yet, but JSON Schema is general enough (with allOf/anyOf) that it should be able to express the schema.org schema as well.
Depends on the purpose here. With this, my goal was to speed up the write-update-debug development loop. Depending on the use case, simply using Google's Structured Data Testing Tool [1] might be a better way to verify schema than JSON Schema?
[1]: https://search.google.com/structured-data/testing-tool
There are a number of tools for generating forms and requisite client and serverside data validations from JSONschema; but I'm not aware of any for RDFS (and thus the schema.org schema [1]). A different use case, for certain.
https://schema.org/docs/developers.html#defs
Definitely a cool area for exploration. I'm not aware of JSON Schema generators from RDFS either.
It should be possible to model the basics (nested structure, repeated structure, defining the "domain" and "range" of a value).
Schema.org definitions, however, have no conception of "required" values[*], so some of the cool form validation we see in some of these tools might not apply here.
[*] _Consumers_ of Schema.org JSON-LD or microdata, however, might define their own data requirements. E.g. Google has some concept of required fields, which you can see when using the Structured Data Testing Tool.
Consumer Protection Bureau Aims to Roll Back Rules for Payday Lending
From the article:
> The way payday loans work is that payday lenders typically offer small loans to borrowers who promise to pay the loans back by their next paycheck. Interest on the loans can have an annual percentage rate of 390 percent or more, according to a 2013 report by the CFPB. Another bureau report from the following year found that most payday loans — as many as 80 percent — are rolled over into another loan within two weeks. Borrowers often take out eight or more loans a year.
390%
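That figure follows directly from typical fee terms. Assuming the commonly cited $15 fee per $100 borrowed over a 14-day term (a simple, non-compounding annualization; rollovers compound it much higher):

```python
# APR of a $15-per-$100, 14-day payday loan (simple annualization)
fee, principal, term_days = 15, 100, 14
apr = (fee / principal) * (365 / term_days)
print(f"{apr:.0%}")  # ≈ 391%
```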
From https://www.npr.org/2019/02/06/691944789/consumer-protection... :
> TARP recovered funds totalling $441.7 billion from $426.4 billion invested, earning a $15.3 billion profit or an annualized rate of return of 0.6% and perhaps a loss when adjusted for inflation.[2][3]
0.6%
Is your point that the US government should get into the payday lending business?
Lectures in Quantitative Economics as Python and Julia Notebooks
It's amazing how we are watching use cases for notebooks and spreadsheets converging. I wonder what the killer feature will be to bring a bigger chunk of the Excel world into a programmatic mindset... Or alternatively, whether we will see notebook UIs embedded in Excel in the future in place of e.g. VBA.
That’s not a bad idea. Spreadsheets are pure functional languages that use literal spaces instead of namespaces.
Notebooks are cells of logic. You could conceivably change the idea of notebook cells to be an instance of a function that points to raw data and returns raw data.
Perhaps this is just Alteryx, though.
This is brilliant.
I'm picturing the ability to write a Python function with the parameters being just like the parameters in an Excel function. You can drag the cell and have it duplicated throughout a row, updating the parameters to correspond to the rows next to it.
It would exponentially expand the power of Excel. I wouldn't be limited to horribly unmaintainable little Excel functions.
VBA can't be used to do that, can it? As far as I understand (and I haven't investigated VBA too much) VBA works on entire spreadsheets.
Essentially, replace the excel formula `=B3-B4` with a Python function `subtract(b3, b4)` where Subtract is defined somewhere more conveniently (in a worksheet wide function definition list?).
This would require reactive recomputation of cells to be anything like a spreadsheet.

> Essentially, replace the excel formula `=B3-B4` with a Python function `subtract(b3, b4)`
as of now jupyter/ipython would not recompute `subtract(b3, b4)` if you change b3 or b4, this has positive and negative (reliance on hidden state and order of execution) effects.
I too would really like something like this, but I think it is pretty far away from where Jupyter is now.
You can build something like this with Jupyter today.
> Traitlets is a framework that lets Python classes have attributes with type checking, dynamically calculated default values, and ‘on change’ callbacks. https://traitlets.readthedocs.io/en/stable/
> Traitlet events. Widget properties are IPython traitlets and traitlets are eventful. To handle changes, the observe method of the widget can be used to register a callback https://ipywidgets.readthedocs.io/en/stable/examples/Widget%...
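A minimal sketch of spreadsheet-style reactive recomputation with traitlets 'on change' callbacks (the class, trait names, and values here are made up):

```python
from traitlets import Float, HasTraits, observe

class Sheet(HasTraits):
    b3 = Float(10.0)
    b4 = Float(4.0)
    b5 = Float(6.0)  # derived cell: b3 - b4

    @observe("b3", "b4")
    def _recompute(self, change):
        # 'on change' callback: recompute the derived cell
        self.b5 = self.b3 - self.b4

sheet = Sheet()
sheet.b3 = 12.0  # triggers _recompute, like editing a spreadsheet cell
```

Unlike ordinary Jupyter cells, assigning to `b3` here immediately propagates to `b5`, with no hidden out-of-order execution state.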
You can definitely build interactive notebooks with Jupyter Notebook and JupyterLab (and ipywidgets or Altair or HoloViews and Bokeh or Plotly for interactive data visualization).
> Qgrid is a Jupyter notebook widget which uses SlickGrid to render pandas DataFrames within a Jupyter notebook. This allows you to explore your DataFrames with intuitive scrolling, sorting, and filtering controls, as well as edit your DataFrames by double clicking cells. https://github.com/quantopian/qgrid
Qgrid's API includes event handler registration: https://qgrid.readthedocs.io/en/latest/
> neuron is a robust application that seamlessly combines the power of Visual Studio Code with the interactivity of Jupyter Notebook. https://marketplace.visualstudio.com/items?itemName=neuron.n...
"Excel team considering Python as scripting language: asking for feedback" (2017) https://news.ycombinator.com/item?id=15927132
OpenOffice Calc ships with Python 2.7 support: https://wiki.openoffice.org/wiki/Python
Procedural scripts written in a general purpose language with named variables (with no UI input except for chart design and persisted parameter changes) are reproducible.
What's a good way to review all of the formulas and VBA and/or Python and data ETL in a spreadsheet?
Is there a way to record a reproducible data transformation script from a sequence of GUI interactions in e.g. OpenRefine or similar?
OpenRefine/OpenRefine/wiki/Jupyter
"Within the Python context, a Python OpenRefine client allows a user to script interactions within a Jupyter notebook against an OpenRefine application instance, essentially as a headless service (although workflows are possible where both notebook-scripted and live interactions take place. https://github.com/OpenRefine/OpenRefine/wiki/Jupyter
Are there data wrangling workflows that are supported by OpenRefine but not Pandas, Dask, or Vaex?
This is interesting; I need to have a closer look. Possibly Refine can be more efficient? But I haven't used it enough to know; just played around with it a bit. Didn't realise you could combine it with Jupyter.
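For comparison, OpenRefine's "fingerprint" key-collision clustering (one of its signature wrangling features) is straightforward to approximate in pandas; a toy sketch, with made-up data and column names:

```python
import pandas as pd

df = pd.DataFrame({"name": ["Acme Inc.", "acme inc", "ACME, Inc.", "Widgets LLC"]})

def fingerprint(s: str) -> str:
    """Rough OpenRefine-style fingerprint key: lowercase, strip
    punctuation, sort the unique tokens."""
    tokens = "".join(ch for ch in s.lower()
                     if ch.isalnum() or ch.isspace()).split()
    return " ".join(sorted(set(tokens)))

df["key"] = df["name"].map(fingerprint)
canonical = df.groupby("key")["name"].first()   # pick one spelling per cluster
df["clean_name"] = df["key"].map(canonical)
```

The three Acme variants collapse to one canonical spelling; OpenRefine adds interactive review of each proposed merge on top of this.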
There are undergraduate and graduate courses in each language:
Python version: https://lectures.quantecon.org/py/
Julia version: https://lectures.quantecon.org/jl/
Does anyone else find it strange that there is no real-world data in these notebooks? It's all simulations or abstract problems.
This gives me the sense, personally, that economists aren't interested in making accurate predictions about the world. Other fields would, I think, test their theories against observations.
pandas-datareader can pull data from e.g. FRED, Eurostat, Quandl, World Bank: https://pandas-datareader.readthedocs.io/en/latest/remote_da...
pandaSDMX can pull SDMX data from e.g. ECB, Eurostat, ILO, IMF, OECD, UNSD, UNESCO, World Bank; with requests-cache for caching data requests: https://pandasdmx.readthedocs.io/en/latest/#supported-data-p...
The scikit-learn estimator interface includes a .score() method. "3.3. Model evaluation: quantifying the quality of predictions" https://scikit-learn.org/stable/modules/model_evaluation.htm...
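For example, `.score()` on a scikit-learn regressor returns R², the coefficient of determination; a minimal sketch on toy data (the data is synthetic, chosen so the fit is exact):

```python
import numpy as np
from sklearn.linear_model import LinearRegression

X = np.arange(10, dtype=float).reshape(-1, 1)
y = 3.0 * X.ravel() + 1.0            # perfectly linear toy data
model = LinearRegression().fit(X, y)
r2 = model.score(X, y)               # R^2; 1.0 for a perfect fit
```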
statsmodels also has various functions for statistically testing models: https://www.statsmodels.org/stable/
"latex2sympy parses LaTeX math expressions and converts it into the equivalent SymPy form" and is now merged into SymPy master and callable with sympy.parsing.latex.parse_latex(). It requires antlr-python-runtime to be installed. https://github.com/augustt198/latex2sympy https://github.com/sympy/sympy/pull/13706
IDK what Julia has for economic data retrieval and model scoring / cost functions?
If Software Is Funded from a Public Source, Its Code Should Be Open Source
From the US Digital Services Playbook [1]:
> PLAY 13
> Default to open
> When we collaborate in the open and publish our data publicly, we can improve Government together. By building services more openly and publishing open data, we simplify the public’s access to government services and information, allow the public to contribute easily, and enable reuse by entrepreneurs, nonprofits, other agencies, and the public.
> Checklist
> - Offer users a mechanism to report bugs and issues, and be responsive to these reports
> [...]
> - Ensure that we maintain contractual rights to all custom software developed by third parties in a manner that is publishable and reusable at no cost
> [...]
> - When appropriate, publish source code of projects or components online
> [...]
> Key Questions
> [...]
> - If the codebase has not been released under an open source license, explain why.
> - What components are made available to the public as open source?
> [...]
Apache Arrow 0.12.0
> Apache Arrow is a cross-language development platform for in-memory data. It specifies a standardized language-independent columnar memory format for flat and hierarchical data, organized for efficient analytic operations on modern hardware. It also provides computational libraries and zero-copy streaming messaging and interprocess communication. Languages currently supported include C, C++, C#, Go, Java, JavaScript, MATLAB, Python, R, Ruby, and Rust.
Statement on Status of the Consolidated Audit Trail (2018)
U.S. Federal District Court Declared Bitcoin as Legal Money
"Application of FinCEN's Regulations to Persons Administering, Exchanging, or Using Virtual Currencies" (2013) https://www.fincen.gov/resources/statutes-regulations/guidan...
"Legality of bitcoin by country or territory" https://en.wikipedia.org/wiki/Legality_of_bitcoin_by_country...
"Know your customer" https://en.wikipedia.org/wiki/Know_your_customer
"Anti-money-laundering measures by region" https://en.wikipedia.org/wiki/Money_laundering#Anti-money-la...
"Anti-money-laundering measures by region > United States" https://en.wikipedia.org/wiki/Money_laundering#United_States
Post Quantum Crypto Standardization Process – Second Round Candidates Announced
> As the latest step in its program to develop effective defenses, the National Institute of Standards and Technology (NIST) has winnowed the group of potential encryption tools—known as cryptographic algorithms—down to a bracket of 26. These algorithms are the ones NIST mathematicians and computer scientists consider to be the strongest candidates submitted to its Post-Quantum Cryptography Standardization project, whose goal is to create a set of standards for protecting electronic information from attack by the computers of both tomorrow and today.
> “These 26 algorithms are the ones we are considering for potential standardization, and for the next 12 months we are requesting that the cryptography community focus on analyzing their performance,”
Links to the 17 public-key encryption and key-establishment algorithms and 9 digital signature algorithms are here: "Round 2 Submissions" https://csrc.nist.gov/Projects/Post-Quantum-Cryptography/Rou...
"Quantum Algorithm Zoo" has moved to https://quantumalgorithmzoo.org .
Ask HN: How do you evaluate security of OSS before importing?
What tools can I use to evaluate the security posture of an OSS project before I approve its usage with high confidence?
Oddly, whether a project has at least one CVE reported could be interpreted in favor of the project. https://www.cvedetails.com
Do they have a security disclosure policy? A dedicated security mailing list?
Do they pay bounties or participate in e.g Pwn2own?
Do they cryptographically sign releases?
Do they cryptographically sign VCS tags (~releases)? commits? `git tag -s` / `git commit/merge -S` https://git-scm.com/book/en/v2/Git-Tools-Signing-Your-Work
Downstream packagers do sometimes/often apply additional patches and then sign their release with the repo (and thus system global) GPG key.
Whether they require "Signed-off-by" may indicate that the project has mature controls and possibly a formal code review process requirement. (Look for "Signed-off-by:" in the release branch: `git commit/merge -s/--signoff`.)
How have they integrated security review into their [iterative] release workflow?
Is the software formally verified? Are parts of the software implementation or spec formally verified?
Does the system trust the channel? The host? Is it a 'trustless' system?
What are the single points of failure?
How is logging configured? To syslog?
Do they run the app as root in a Docker container? Does it require privileged containers?
If it has to run as root, does it drop privileges at startup?
Does the package have an SELinux or AppArmor policy? (Or does it say e.g. "just set SELinux to permissive mode"?)
Is there someone you can pay to support the software in an enterprise environment? Open or closed, such contacts basically never accept liability; but if there is an SLA, do you get a pro-rated bill?
As far as indicators of actual software quality:
How much test coverage is there? Line coverage or branch coverage?
Do they run static analysis tools for all pull requests and releases? Dynamic analysis? Fuzzing?
Of course, closed or open source projects may do none or all of these and still be totally secure or insecure.
This is a pretty extensive list. Thanks for sharing!
Ask HN: How can I use my programming skills to support nonprofit organizations?
Lately I've been thinking about doing programming for nonprofits, both because I want to help out with what I'm good at but also to hone my skills and potentially get some open source credit.
So far I've had a hard time finding nonprofit projects where I can just pick up something and start programming. I know about freecodecamp.org, but they force you to go through their courses, and as I already have multiple years of experience as a developer, I feel like that would be a waste of time.
Isn't there a way to contribute to nonprofit organization in a more direct and simple manner like how you would contribute to an open source project on GitHub?
There are lots of project management systems with issue tracking and kanban boards with swimlanes. Because it's unreasonable to expect all volunteers to have a GH account or even understand what GH is for, support for external identity management and SSO may be essential to getting people to actually log in and change their password regularly.
Saddling a nonprofit with custom-built software that has no other maintainers is not what they need. Build (and pay for development, maintenance, timely security upgrades, and security review) or buy (where is our data? Who backs it up? How much does it cost for a month or a few years? Is it open source with a hosted option, so that we can pay a developer to add or fix what we need?)
"Solutions architect" may be a more helpful objective title for what's needed. https://en.wikipedia.org/wiki/Solution_architecture
What are their needs? Marketing, accounting, operations, HR
Marketing: web site, maps service, directions, active social media presence that speaks to their defined audience
Accounting: Revenue and expenses, payroll/benefits/HR, projections, "How can we afford to do more?", handle donations and send receipts for tax purposes, reports to e.g. https://charitynavigator.org/ and infographics for wealth-savvy donors
Operations: Asset inventory, project management, volunteer scheduling
HR: payroll, benefits, volunteer scheduling, training, turnover, retaining selfless and enlightenedly-self-interested volunteers
Create a spreadsheet. Rows: needs/features/business processes. Columns: essential, nice to have, software products and services.
Create another spreadsheet. Rows: APIs. Columns: APIs.
Training: what are the [information systems] processes/workflows/checklists? How can I suggest a change? How do we reach consensus that there's a better way to do this? Is there a wiki? Is there a Q&A system?
"How much did you sink on that? Probably seemed like the best option according to the information available at the time, huh? Do you have a formal systems acquisition process? Who votes according to what type of prepared analysis? How much would it cost to switch? What do we need to do to ETL (extract, transform, and load) into a newer better system?"
When estimating TCO for a nonprofit, turnover is a very real consideration. People move. Chances are, as with most organizations TBH, there's a patchwork of partially-integrated and maybe-integrable systems that it may or may not be more cost-effective and maintainable to replace with a cloud ERP specifically designed for nonprofits.
Who has access rights to manually update which parts of the website? How can we include dynamic ([other] database-backed) content in our website? What is a CMS? What is an ERP? What is a CRM? Are these customers, constituents, or both? When did we last speak with those guys? How can people share our asks with social media networks?
If you're not willing or able to make a long-term commitment, the more responsible thing to do is probably to disclose any conflicts of interest and recommend a SaaS solution hosted in a compliant data center.
q="nonprofit erp"
q="nonprofit crm"
q="nonprofit cms" + donation campaign visibility
What time of day are social media posts most likely to get maximum engagement from which segments of our audience? What is our ~ARPU "average revenue per user/follower"?
... As a volunteer and not an FTE, it may be a worthwhile exercise to build a prototype of the new functionality with whatever tools you happen to be familiar with, with the expectation that they'll figure out a way to accomplish the same objectives with their existing systems. If that's not possible, there may be a business opportunity: are there other organizations with the same need? Is there a sustainable market for such a solution? You may be building to be acquired.
Ask HN: Steps to forming a company?
Hey guys, I'm leaving my firm very shortly to form a startup.
Does anyone have a checklist of proper ways to do things?
I.e. 1. Form a Delaware C corp with Clerky 2. Hire payroll company X 3. Use this company for patents.
Any info there?
From "Ask HN: What are your favorite entrepreneurship resources" https://news.ycombinator.com/item?id=15021659 :
> USA Small Business Administration: "10 steps to start your business." https://www.sba.gov/starting-business/how-start-business/10-...
> "Startup Incorporation Checklist: How to bootstrap a Delaware C-corp (or S-corp) with employee(s) in California" https://github.com/leonar15/startup-checklist
> FounderKit has reviews for Products, Services, and Software for founders: https://founderkit.com
... I've heard good things about Gusto for payroll, HR, and benefits through Guideline: https://gusto.com/product/pricing
A Self-Learning, Modern Computer Science Curriculum
Outstanding resource.
jwasham/coding-interview-university also links to a number of helpful OER resources: https://github.com/jwasham/coding-interview-university
MVP Spec
> The criticism of the MVP approach has led to several new approaches, e.g. the Minimum Viable Experiment MVE[19] or the Minimum Awesome Product MAP[20].
https://en.wikipedia.org/wiki/Minimum_viable_product#Critici...
Can we merge Certificate Transparency with blockchain?
From "REMME – A blockchain-based protocol for issuing X.509 client certificates" https://news.ycombinator.com/item?id=18868540 :
""" In no particular order, there are a number of blockchain PKI (and DNS (!)) proposals and proofs of concept.
"CertLedger: A New PKI Model with Certificate Transparency Based on Blockchain" (2018) https://arxiv.org/pdf/1806.03914 https://scholar.google.com/scholar?q=related:LF9PMeqNOLsJ:sc...
"TABLE 1: Security comparison of Log Based Approaches to Certificate Management" (p.12) lists a number of criteria for blockchain-based PKI implementations:
- Resilient to split-world/MITM attack
- Provides revocation transparency
- Eliminates client certificate validation process
- Eliminates trusted key management
- Preserves client privacy
- Require external auditing
- Monitoring promptness
... These papers also clarify why a highly-replicated decentralized trustless datastore — such as a blockchain — is advantageous for PKI. WoT is not mentioned.
"Blockchain-based Certificate Transparency and Revocation Transparency" (2018) https://fc18.ifca.ai/bitcoin/papers/bitcoin18-final29.pdf
https://scholar.google.com/scholar?q=related:oEsKmJvdn-MJ:sc...
Who can update and revoke which records in a permissioned blockchain (or a plain old database, for that matter)?
Letsencrypt has a model for proving domain control with ACME; which AFAIU depends upon DNS, too. """
TLA references "Certificate Transparency Using Blockchain" (2018) https://eprint.iacr.org/2018/1232.pdf https://scholar.google.com/scholar?q="Certificate+Transparen...
Thanks for the references! The main issue isn't the support and maintenance of such a distributed network, but its integration with current solutions and avoiding centralized middleware services that will weaken the schema described in the documents.
> The main issue isn't the support and maintenance of such a distributed network,
Running a permissioned blockchain is nontrivial. "Just fork XYZ and call it a day" doesn't quite describe the amount of work involved. There's read latency at scale. There's merging upstream changes to maintain vendor strings.
> but its integration with current solutions
- Verify issuee identity
- Update (domain/CN/subjectAltName, date) index
- Update cached cert and CRL bundles
- Propagate changes to all clients
> and avoiding centralized middleware services that will weaken the schema described in the documents.
Eventually, a CDN will look desirable. IPFS may fit the bill, IDK?
> Running a permissioned blockchain is nontrivial.
You are right. It needs a relevant BFT protocol, a lot of work with the masternode community, and a smart economic system inside. You can look at an example of such a protocol: https://github.com/Remmeauth/remme-core/tree/dev
google/trillian https://github.com/google/trillian
> Trillian is an implementation of the concepts described in the Verifiable Data Structures white paper, which in turn is an extension and generalisation of the ideas which underpin Certificate Transparency.
> Trillian implements a Merkle tree whose contents are served from a data storage layer, to allow scalability to extremely large trees.
Why Don't People Use Formal Methods?
Which universities teach formal methods?
- q=formal+verification https://www.class-central.com/search?q=formal+verification
- q=formal-methods https://www.class-central.com/search?q=formal+methods
Is formal verification a required course or curriculum competency for any Computer Science or Software Engineering / Computer Engineering degree programs?
Is there a certification for formal methods? Something like for engineer-status in other industries?
What are some examples of tools and [OER] resources for teaching and learning formal methods?
- JsCoq
- Jupyter kernel for Coq + nbgrader
- "Inconsistencies, rolling back edits, and keeping track of the document's global state" https://github.com/jupyter/jupyter/issues/333 (jsCoq + hott [+ IJavascript Jupyter kernel], STLC: Simply-Typed Lambda Calculus)
- TDD tests that run FV tools on the spec and the implementation
What are some examples of open source tools for formal verification (that can be integrated with CI to verify the spec AND the implementation)?
What are some examples of formally-proven open source projects?
- "Quark : A Web Browser with a Formally Verified Kernel" (2012) (Coq, Haskell) http://goto.ucsd.edu/quark/
What are some examples of projects using narrow and strong AI to generate perfectly verified software from bad specs that make the customers and stakeholders happy?
From reading through the comments here, people don't use formal methods because: cost-prohibitive, inflexible, perceived as incompatible with agile / iterative methods that are more likely to keep customers who don't know what formal methods are happy, lack of industry-appropriate regulation, and the cognitive burden of often-incompatible shorthand notations.
Almost a University of New South Wales (Sydney, Australia) alum here, and it's the biggest difference between the Software Engineering course and Comp Sci. There are two mandatory formal methods courses and additional optional ones to take. They are really hard, and that means a lot of the SENG cohort ends up graduating as COMPSCI, since they can't hack it and don't want to repeat formal methods.
Steps to a clean dataset with Pandas
To add to the three points in the article:
Data quality https://en.wikipedia.org/wiki/Data_quality
Imputation https://en.wikipedia.org/wiki/Imputation_(statistics)
Feature selection https://en.wikipedia.org/wiki/Feature_selection
datacleaner can drop NaNs, do imputation with "the mode (for categorical variables) or median (for continuous variables) on a column-by-column basis", and encode "non-numerical variables (e.g., categorical variables with strings) with numerical equivalents" with Pandas DataFrames and scikit-learn. https://github.com/rhiever/datacleaner
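In pandas alone, that column-by-column imputation strategy can be sketched as follows (this mirrors the defaults described above rather than calling datacleaner's own API; column names are made up):

```python
import pandas as pd

df = pd.DataFrame({
    "color": ["red", None, "red", "blue"],  # categorical (strings)
    "price": [1.0, 2.0, None, 4.0],         # continuous
})

for col in df.columns:
    if df[col].dtype == object:
        # mode for categorical columns
        df[col] = df[col].fillna(df[col].mode()[0])
    else:
        # median for continuous columns
        df[col] = df[col].fillna(df[col].median())

# encode non-numerical (string) columns with numerical equivalents
df["color"] = df["color"].astype("category").cat.codes
```

The category codes are assigned in sorted order of the category labels, so "blue" becomes 0 and "red" becomes 1 here.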
sklearn-pandas "[maps] DataFrame columns to transformations, which are later recombined into features", and provides "A couple of special transformers that work well with pandas inputs: CategoricalImputer and FunctionTransformer" https://github.com/scikit-learn-contrib/sklearn-pandas
Featuretools https://github.com/Featuretools/featuretools
> Featuretools is a python library for automated feature engineering. [using DFS: Deep Feature Synthesis]
auto-sklearn does feature selection (with e.g. PCA) in a "preprocessing" step; as well as "One-Hot encoding of categorical features, imputation of missing values and the normalization of features or samples" https://automl.github.io/auto-sklearn/master/manual.html#tur...
auto_ml uses "Deep Learning [with Keras and TensorFlow] to learn features for us, and Gradient Boosting [with XGBoost] to turn those features into accurate predictions" https://auto-ml.readthedocs.io/en/latest/deep_learning.html#...
Reahl – A Python-only web framework
I feel bad about projects like these, I really do. I'm sure a lot of effort has been put in, and it probably satisfies what OP wants to do. But in the end, no one serious would ever consider using it.
>But in the end, no one serious would ever consider using it.
Why not?
I would say it is the other way around. For any web application project of decent size there is almost always a transpilation step that converts whatever programming language your source code is in, into JavaScript (usually ES5). We also had very successful and widely used projects like GWT for years (GWT was first released in 2006!!!).
Before GWT, there was Wt framework (C++); and then JWt (Java), which do the server and clientsides (with widgets in a tree).
Wt: https://en.wikipedia.org/wiki/Wt_(web_toolkit)
JWt: https://en.m.wikipedia.org/wiki/JWt_(Java_web_toolkit)
GWT: https://en.wikipedia.org/wiki/Google_Web_Toolkit
Now we have Babel, ES YYYY, and faster browser release cycles.
[deleted]
Ask HN: How can you save money while living on poverty level?
I freelance remotely, making roughly $1200 a month as a programmer, because I only work 10 hours maximum each week (limited by my contract). I share an apartment with my mom, and it's Section 8, so our rent contributions are based on the income we make. My contribution toward rent is $400 a month.
Although I make more money than my mom (she's of retirement age and only works 1-2 days a week), while I look for more work I want to figure out how to move out and live more independently on only $1200 a month.
I need to live frugally and want to know what I can cut most easily. I own a used car (already paid in full), and pay my own car insurance, electricity, phone, and internet. After all that I have about $400 left each month, which can be eaten up by going out or by emergencies.
More recently I had to pay for my new city parking sticker, so that's $100 more in expenses this particular month. I would be satisfied just living in a far-off town paying the same $400 a month; I feel my dollars would stretch further, since I'd get 100% more privacy for the same price.
On top of that, this is a contract job, so I need to put money aside to pay my own taxes. This $1200 is basically the poverty level. Any ideas to make saving work? Is it even possible for people in the US to save while living in poverty?
That's not a living wage (or a full time job). There are lots of job search sites.
Spending some time on a good resume / CV / portfolio would probably be a good investment with positive ROI.
Is there a nonprofit that you could volunteer with to increase your hireability during the other 158 hours of the week?
Or an online course with a credential that may or may not have positive ROI as a resume item?
Is there a code school in your city with a "you don't pay unless you land a full time job with a living wage and benefits" guarantee?
What is your strategy for business and career networking?
From https://westurner.github.io/hnlog/#comment-17894632 :
> Personal Finance (budgets, interest, growth, inflation, retirement)
Personal Finance https://en.wikipedia.org/wiki/Personal_finance
Khan Academy > College, careers, and more > Personal finance https://www.khanacademy.org/college-careers-more/personal-fi...
"CS 007: Personal Finance For Engineers" https://cs007.blog
A DNS hijacking wave is targeting companies at an almost unprecedented scale
> The National Cybersecurity and Communications Integration Center issued a statement [1] that encouraged administrators to read the FireEye report. [2]
[1] https://www.us-cert.gov/ncas/current-activity/2019/01/10/DNS...
[2] https://www.fireeye.com/blog/threat-research/2019/01/global-...
Show HN: Generate dank mnemonic seed phrases in the terminal
From https://github.com/lukechilds/doge-seed :
> The first four words will be a randomly generated Doge-like sentence.
The seed phrases are fully valid checksummed BIP39 seeds. They can be used with any cryptocurrency and can be imported into any BIP39 compliant wallet.
> […] However there is a slight reduction in entropy due to the introduction of the doge-isms. A doge seed has about 19.415 fewer bits of entropy than a standard BIP39 seed of equivalent length.
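For intuition, each BIP39 word encodes log2(2048) = 11 bits, so the entropy cost of constraining the first four words can be estimated like this (the pool sizes below are invented placeholders, not doge-seed's actual word lists; the README's measured figure is 19.415 bits):

```python
import math

BIP39_WORDLIST = 2048
# four unconstrained BIP39 words carry 4 * 11 = 44 bits
unconstrained = 4 * math.log2(BIP39_WORDLIST)

# hypothetical sizes of the doge-ism word pools for the first four slots,
# e.g. a tiny {wow, such, very, much}-style set plus three small lists
doge_pools = [4, 64, 64, 64]
constrained = sum(math.log2(n) for n in doge_pools)

loss = unconstrained - constrained  # bits of entropy given up
```

With these placeholder pools the loss comes out to 24 bits; plugging in the real doge-seed list sizes would reproduce the 19.415-bit figure.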
Can you sign a quantum state?
> Abstract. Cryptography with quantum states exhibits a number of surprising and counterintuitive features. In a 2002 work, Barnum et al. argued informally that these strange features should imply that digital signatures for quantum states are impossible [6].
> In this work, we perform the first rigorous study of the problem of signing quantum states. We first show that the intuition of [6] was correct, by proving an impossibility result which rules out even very weak forms of signing quantum states. Essentially, we show that any non-trivial combination of correctness and security requirements results in negligible security.
> This rules out all quantum signature schemes except those which simply measure the state and then sign the outcome using a classical scheme. In other words, only classical signature schemes exist.
> We then show a positive result: it is possible to sign quantum states, provided that they are also encrypted with the public key of the intended recipient. Following classical nomenclature, we call this notion quantum signcryption. Classically, signcryption is only interesting if it provides superior efficiency to simultaneous encryption and signing. Our results imply that, quantumly, it is far more interesting: by the laws of quantum mechanics, it is the only signing method available.
> We develop security definitions for quantum signcryption, ranging from a simple one-time two-user setting, to a chosen-ciphertext-secure many-time multi-user setting. We also give secure constructions based on post-quantum public-key primitives. Along the way, we show that a natural hybrid method of combining classical and quantum schemes can be used to “upgrade” a secure classical scheme to the fully-quantum setting, in a wide range of cryptographic settings including signcryption, authenticated encryption, and chosen-ciphertext security.
"Quantum signcryption"
Lattice Attacks Against Weak ECDSA Signatures in Cryptocurrencies [pdf]
From the paper:
Abstract. In this paper, we compute hundreds of Bitcoin private keys and dozens of Ethereum, Ripple, SSH, and HTTPS private keys by carrying out cryptanalytic attacks against digital signatures contained in public blockchains and Internet-wide scans.
REMME – A blockchain-based protocol for issuing X.509 client certificates
pity it's blockchain based
It's unclear to me why you would want a distributed PKI to authenticate a centralized app. Or maybe it's only for dapps?
In no particular order, there are a number of blockchain PKI (and DNS (!)) proposals and proofs of concept.
"CertLedger: A New PKI Model with Certificate Transparency Based on Blockchain" (2018) https://arxiv.org/pdf/1806.03914 https://scholar.google.com/scholar?q=related:LF9PMeqNOLsJ:sc...
"TABLE 1: Security comparison of Log Based Approaches to Certificate Management" (p.12) lists a number of criteria for blockchain-based PKI implementations:
- Resilient to split-world/MITM attack
- Provides revocation transparency
- Eliminates client certificate validation process
- Eliminates trusted key management
- Preserves client privacy
- Require external auditing
- Monitoring promptness
... These papers also clarify why a highly-replicated decentralized trustless datastore — such as a blockchain — is advantageous for PKI. WoT is not mentioned.
California grid data is live – solar developers take note
> It looks like California is at least two generations of technology ahead of other states. Let’s hope the rest of us catch up, so that we have a grid that can make an asset out of every building, every battery, and every solar system.
+1. Are there any other states with similar grid data available for optimization; or any plans to require or voluntarily offer such a useful capability?
Why attend predatory colleges in the US?
> Why would people attend predatory colleges?
Why would people make an investment with insufficient ROI (Return on Investment)?
Insufficient information.
College Scorecard [1] is a database with a web interface for finding and comparing schools according to a number of objective criteria. CollegeScorecard launched in 2015. It lists "Average Annual Cost", "Graduation Rate", and "Salary After Attending" on the search results pages. When you review a detail page for an institution, there are many additional statistics; things like: "Typical Total Debt After Graduation" and "Typical Monthly Loan Payment".
The raw data behind CollegeScorecard can be downloaded from [2]. The "data_dictionary" tab of the "Data Dictionary" spreadsheet describes the data schema.
[1] https://collegescorecard.ed.gov
[2] https://collegescorecard.ed.gov/data/
Khan Academy > "College, careers, and more" [3] may be a helpful supplement to a full-time college admissions counselor in a secondary education institution.
[3] https://www.khanacademy.org/college-careers-more
(I haven't the time to earn 10 academia.stackexchange points in order to earn the prestigious opportunity to contribute this answer to such a forum with threaded comments. In the academic journal system, journals sell academics' work (i.e. schema.org/ScholarlyArticle PDFs, mobile-compatible responsive HTML 5, RDFa, JSON-LD structured data) and keep all of the revenue).
"Because I need money for school! Next question. CPU: College Textbook costs and CPI: All over time t?!"
Ask HN: Data analysis workflow?
What kind of workflow do you employ when designing a data-flow or analyzing data?
Let me give a concrete example. For the past year, I have been selling stuff on the interwebs through two payment processors one of them being PayPal.
The selling process was put together with a bunch of SaaS hooking everything together through webhooks and notifications.
Now I need to step up that control and produce a proper flow to handle sign-up, subscription, and payment.
Before doing that, I'm analyzing and trying to reconcile all transactions to make sure the books are OK and nothing went unseen. Therein lies the problem. I have data coming from different sources such as databases, Excel files, CSV exports, and some JSON files.
At first, I started dealing with it by having all the data in CSV files and trying to make sense of them using code and running queries within the code.
As I found holes in the data I had to dig up more data from different sources and it became a pain to continue with code. I now imported everything into Postgres and have been "debugging" with SQL.
As I advanced through the process I had to generate a lot of routines to collect and match data. I also have to keep all the data files around and organized which is very hard to do because I'm all over the place trying to find where the problem is.
How do you handle this? What kind of workflow do you use? Any best practices or recommendations from people who do this for a living?
Pachyderm may be basically what you're looking for. It does data version control with/for language-agnostic pipelines that don't need to always redo the ETL phase. https://www.pachyderm.io
Dask-ML works with {scikit-learn, xgboost, tensorflow, TPOT,}. ETL is your responsibility. Loading things into parquet format affords a lot of flexibility in terms of (non-SQL) datastores or just efficiently packed files on disk that need to be paged into/over in RAM. http://ml.dask.org/examples/scale-scikit-learn.html
Sklearn.pipeline.Pipeline API: {fit(), transform(), predict(), score(),} https://scikit-learn.org/stable/modules/generated/sklearn.pi...
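Any object exposing that duck-typed API plugs into a Pipeline; a minimal custom transformer following sklearn's fit/transform conventions might look like this (pure-Python sketch, names mine):

```python
class MeanCenter:
    """Minimal transformer following sklearn's fit/transform protocol:
    fit() learns state (suffixed with '_') and returns self,
    transform() applies it."""

    def fit(self, X, y=None):
        self.mean_ = sum(X) / len(X)
        return self

    def transform(self, X):
        return [x - self.mean_ for x in X]

mc = MeanCenter().fit([1.0, 2.0, 3.0])
centered = mc.transform([1.0, 2.0, 3.0])  # [-1.0, 0.0, 1.0]
```

Because Pipeline, GridSearchCV, etc. only require these methods, third-party estimators like xgboost can slot in interchangeably.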
https://docs.featuretools.com can also minimize ad-hoc boilerplate ETL / feature engineering :
> Featuretools is a framework to perform automated feature engineering. It excels at transforming temporal and relational datasets into feature matrices for machine learning.
The PLoS 10 Simple Rules papers distill a number of best practices:
"Ten Simple Rules for Reproducible Computational Research" http://www.ploscompbiol.org/article/info%3Adoi%2F10.1371%2Fj...
“Ten Simple Rules for Creating a Good Data Management Plan” http://journals.plos.org/ploscompbiol/article?id=10.1371/jou...
In terms of the scientific method, a null hypothesis like "there is no significant relation between the [independent and dependent] variables" may be dangerously unprofessional p-hacking and data dredging; and may result in an overfit model that seems to predict or classify the training and test data (when split with e.g. sklearn.model_selection.train_test_split and a given random seed).
One of these days (in the happy new year!) I'll get around to updating these notes with the aforementioned tools and docs: https://wrdrd.github.io/docs/consulting/data-science#scienti...
IDK what https://kaggle.com/learn has specifically in terms of analysis workflow? Their docker containers have very many tools configured in a reproducible way: https://github.com/Kaggle/docker-python/blob/master/Dockerfi...
Ask HN: What is your favorite open-source job scheduler
Too many business scripts rely on cron(8) to run. Classic cron cannot handle task duration, failure (email-only notification), same-task pile-up, linting, ...
So, what is your favorite open-source, easy-to-bundle/deploy job scheduler that is easy to use, has logging capability and config-file linting, and can handle common use cases: kill if running longer than a limit, limit resources, prevent launching when the previous run is not finished, ...
systemd-crontab-generator may be usable for something like linting classic crontabs? https://github.com/systemd-cron/systemd-cron
Systemd/Timers as a cron replacement: https://wiki.archlinux.org/index.php/Systemd/Timers#As_a_cro...
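For example, a minimal timer/service pair (unit names and paths hypothetical) that replaces a crontab line while adding the timeout and memory limits classic cron can't express:

```ini
# /etc/systemd/system/nightly-report.service
[Unit]
Description=Nightly report job

[Service]
Type=oneshot
ExecStart=/usr/local/bin/nightly-report
# limits cron has no equivalent for:
TimeoutStartSec=30min
MemoryMax=512M

# /etc/systemd/system/nightly-report.timer
[Unit]
Description=Run nightly report at 02:00

[Timer]
OnCalendar=*-*-* 02:00:00
Persistent=true

[Install]
WantedBy=timers.target
```

A timer won't start the service again while a previous activation is still running, and failures show up in `systemctl status` and the journal rather than in mail.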
Celery supports periodic tasks:
> Like with cron, the tasks may overlap if the first task doesn’t complete before the next. If that’s a concern you should use a locking strategy to ensure only one instance can run at a time (see for example Ensuring a task is only executed one at a time).
http://docs.celeryproject.org/en/latest/userguide/periodic-t...
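The locking strategy can be as simple as a non-blocking flock(2) on a lock file (a POSIX-only sketch; the function name and lock path are mine):

```python
import fcntl

def run_exclusively(lockfile, job):
    """Run job() only if no other instance holds the lock.
    flock is released automatically if the process dies,
    unlike a plain pidfile."""
    fh = open(lockfile, "w")
    try:
        fcntl.flock(fh, fcntl.LOCK_EX | fcntl.LOCK_NB)
    except BlockingIOError:
        fh.close()
        return False  # a previous run is still going; skip this one
    try:
        job()
        return True
    finally:
        fcntl.flock(fh, fcntl.LOCK_UN)
        fh.close()

ran = run_exclusively("/tmp/nightly-report.lock", lambda: print("report generated"))
```

The same idea is what the Celery docs' "Ensuring a task is only executed one at a time" recipe does with a distributed lock instead of a local file.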
How to Version-Control Jupyter Notebooks
Mentioned in the article: manual nbconvert, nbdime, ReviewNB (currently GitHub only), jupytext.
Jupytext includes a bit of YAML in the e.g. Python/R/Julia/Markdown header. https://github.com/mwouts/jupytext
Huge +1s for both nbdime and jupytext. Excellent tools both.
Really enjoying jupytext. I do a bunch of my training from Jupyter and it has made my workflow better.
Teaching and Learning with Jupyter (A book by Jupyter for Education)
Still under heavy development, but already useful. And we accept pull requests to add items or fix issues. Join us!
A small suggestion.
Since Jupyter Notebook can be used for both programming and documentation, why don't you use Jupyter Notebook itself as the source of your document?
It is actually very easy to set up a Jupyter-Notebook-driven .ipynb -> .html publishing pipeline with GitHub + a local Jupyter instance.
Here is a toy example (for my own github page)
https://ontouchstart.github.io/
https://github.com/ontouchstart/ontouchstart.github.io
The convert script is here (also a Jupyter Notebook)
https://github.com/ontouchstart/ontouchstart.github.io/blob/...
You get the idea.
BTW, to make the system fully replicable, I use docker for the local Jupyter instance, which can be launched via the Makefile
https://github.com/ontouchstart/ontouchstart.github.io/blob/...
Here is the custom Dockerfile:
https://github.com/ontouchstart/ontouchstart.github.io/blob/...
Margin Notes: Automatic code documentation with recorded examples from runtime
Slightly related are Go examples - they're tests and documentation at the same time. It'd be nice if someone hooked in a listener to automatically collect examples, though.
And Elixir's doctests https://elixir-lang.org/getting-started/mix-otp/docs-tests-a...
And Python's Axe!
I mean doctest: https://docs.python.org/3/library/doctest.html
1. sys.settrace() for {call, return, exception, c_call, c_return, and c_exception}
2. Serialize as/to doctests. Is there a good way to serialize Python objects as Python code?
3. Add doctests to callables' docstrings with AST
Mutation testing tools may have already implemented serialization to doctests but IDK about docstring modification.
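A stdlib-only sketch of steps 1-2 (the serializer here is just repr(), which only works for simple values and is exactly the hard part for arbitrary objects; function names are mine):

```python
import sys

def trace_to_doctests(func, *args, **kwargs):
    """Run func under sys.settrace, recording its call and return
    events, then format them as doctest-style lines."""
    events = []

    def tracer(frame, event, arg):
        if frame.f_code is func.__code__:
            if event == "call":
                # argument names/values are the frame's locals at entry
                events.append(("call", dict(frame.f_locals)))
            elif event == "return":
                events.append(("return", arg))
        return tracer  # keep tracing nested frames too

    sys.settrace(tracer)
    try:
        func(*args, **kwargs)
    finally:
        sys.settrace(None)

    lines = []
    for kind, payload in events:
        if kind == "call":
            argstr = ", ".join(f"{k}={v!r}" for k, v in payload.items())
            lines.append(f">>> {func.__name__}({argstr})")
        else:
            lines.append(repr(payload))
    return "\n".join(lines)

def add(a, b):
    return a + b

doctest_text = trace_to_doctests(add, 2, 3)
# ">>> add(a=2, b=3)\n5"
```

Step 3 (splicing the generated text back into the callable's docstring via the AST) is where it gets genuinely hairy.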
... MOSES is an evolutionary algorithm that mutates and simplifies a combo tree until it has built a function with less error for the given input/output pairs.
Time to break academic publishing's stranglehold on research
Add in that the quality of the system is massively, massively broken... peer review is about as accurate as a coin flip. It does not promote groundbreaking or novel research; it barely (arguably doesn't) even contribute to quality research. I had a colleague recently be told by a journal editor, 'we don't publish critiques from junior scholars.' So much for peer review being driven entirely by the quality of the work.
As one of those academics... I keep getting requests to peer review; I respectfully make clear that I don't review for non-open-source journals anymore. Same with publishing. I'm not tenure-track, so I'm not primarily evaluated based on output.
Publishing is broken, but it is really just part of the broader and even more broken nature of academic research.
What would be a good alternative to peer-review though? Genuinely interested.
Open publishing and commenting would be a good start: having a dialog, like is done at conferences. Older academic journal articles (pre-1900) read much more like discussions than the hundred-dollar-word vomits of modern academic publishing. The broken incentives are at the core of this rotten fruit, though. Just making journals open isn't enough.
We have (almost) open publishing and open commenting. Did that improve anything?
There's open commenting? I've never seen the back and forth of the review process be published. It should be published.
https://hypothes.is supports threaded comments on anything with a URI; including PDFs and specific sentences or figures thereof. All you have to do is register an account and install the browser extension or include the JS in the HTML.
It's based on open standards and an open platform.
W3C Web Annotations: http://w3.org/annotation
About Hypothesis: https://web.hypothes.is/about/
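Per their docs (as I understand them), including the client in a page is a single script tag:

```html
<script src="https://hypothes.is/embed.js" async></script>
```

With that in the HTML, visitors with Hypothesis accounts can annotate and reply inline without the site operator building any commenting system.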
Ask HN: How can I learn to read mathematical notation?
There are a lot of fields I'm interested in, such as machine learning, but I struggle to understand how they work as most resources I come across are full of complex mathematical notation that I never learned how to read in school or University.
How do you learn to read this stuff? I'm frequently stumped by an academic paper or book that I just can't understand due to mathematical notation that I simply cannot read.
https://en.wikipedia.org/wiki/List_of_logic_symbols
https://en.wikipedia.org/wiki/Table_of_mathematical_symbols
These might help a bit.
But as someone with similar problems, I'm beginning to think there's no real solution other than thousands of hours of studying.
There are a number of Wikipedia pages which catalog various uses of symbols for various disciplines:
Outline_of_mathematics#Mathematical_notation https://en.wikipedia.org/wiki/Outline_of_mathematics#Mathema...
List_of_mathematical_symbols https://en.wikipedia.org/wiki/List_of_mathematical_symbols
List_of_mathematical_symbols_by_subject https://en.wikipedia.org/wiki/List_of_mathematical_symbols_b...
Greek_letters_used_in_mathematics,_science,_and_engineering https://en.wikipedia.org/wiki/Greek_letters_used_in_mathemat...
Latin_letters_used_in_mathematics https://en.wikipedia.org/wiki/Latin_letters_used_in_mathemat...
For learning the names of symbols (and maybe also their meaning as conventionally utilized in a particular field at a particular time in history), spaced repetition with flashcards in a tool like Anki may be helpful.
For typesetting, e.g. Jupyter Notebook uses MathJax to render LaTeX with JS.
latex2sympy may also be helpful for learning notation.
… data-science#mathematical-notation https://wrdrd.github.io/docs/consulting/data-science#mathema...
New law lets you defer capital gains taxes by investing in opportunity zones
I believe the way the program works is: you can defer taxes on your original capital gains (and the cost basis gets increased, so your deferred taxes are less than you would otherwise pay), reinvest them in an "opportunity zone", and not pay capital gains on your opportunity-zone investment if you hold it long enough.
e.g. Bob bought Apple stock for $50 a share back in the day, and sells it for $200/share. He defers his taxes until 2026. Instead of paying capital gains tax on $150/share, the cost basis is adjusted by 15%, so in addition to benefiting from the time value of money, future Bob will only be taxed on $127.50/share of capital gains. Bob can buy a house in an "opportunity zone" (from scrolling around the embedded map, there are plenty of million-dollar-plus houses in these areas; there are also lots of sports teams and stadiums in these areas, so maybe Bob buys an NFL team or a parking lot next to their stadium), rent it out for 10 years, sell it, and not have to pay any capital gains tax on the appreciation. Definitely not a bad deal for him!
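The arithmetic in Bob's example, spelled out (assuming the 15% basis step-up on the deferred gain after a 7+ year hold; illustrative only, not tax advice):

```python
# per-share figures from the example above
purchase, sale = 50.0, 200.0
gain = sale - purchase                  # $150.00 of deferred gain
step_up = 0.15 * gain                   # $22.50 added to basis
taxable_deferred_gain = gain - step_up  # $127.50 taxed in 2026
```
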
Yes, it's more akin to a 1031 exchange, but opened up so that non-real-estate capital gains are eligible.
Is it just capital gains? Wondering if it applies to any other forms of active or passive income.
How are profits from these investments treated?
Can you "swap til you drop" like with a 1031 exchange?
> Is it just capital gains? Wondering if it applies to any other forms of active or passive income.
I would also like some information about this.
+1 for investing in distressed areas; self-nominated with intent or otherwise.
If it's capital gains only, -1 on requiring the sale of capital assets in order to be sufficiently incentivized. (Because then the opportunity for tax-advantaged investment in Opportunity Zones is denied to persons without assets to liquidate; i.e., unequal opportunity.)
How to Write a Technical Paper [pdf]
A few years ago I found a great version of a similar piece that proposed something like a 'sandwich' model for each section and for the work as a whole. A sandwich layer was a hook (something interesting), the meat (what you actually did), and a summary.
I failed to save it and I haven't been able to dig it up again, but I liked the idea. The paper was written in the style it described, as well.
5 paragraph essay? https://en.wikipedia.org/wiki/Five-paragraph_essay
> The five-paragraph essay is a format of essay having five paragraphs: one introductory paragraph, three body paragraphs with support and development, and one concluding paragraph. Because of this structure, it is also known as a hamburger essay, one three one, or a three-tier essay.
The digraph presented in the OP really is a great approach, IMHO:
## Introduction
## Related Work, System Model, Problem Statement
## Your Solution
## Analysis
## Simulation, Experimentation
## Conclusion
... "Elements of the scientific method" https://en.wikipedia.org/wiki/Scientific_method#Elements_of_...
no, it was on arxiv somewhere. It was written as a journal article.
edit: AH! it was linked upthread on bioarxiv.
JSON-LD 1.0: A JSON-Based Serialization for Linked Data
JSON-LD 1.1 https://www.w3.org/TR/json-ld11/
"Changes since 1.0 Recommendation of 16 January 2014" https://www.w3.org/TR/json-ld11/#changes-from-10
Jeff Hawkins Is Finally Ready to Explain His Brain Research
Cortical column: https://en.wikipedia.org/wiki/Cortical_column
> In the neocortex 6 layers can be recognized although many regions lack one or more layers, fewer layers are present in the archipallium and the paleopallium.
What this means in terms of optimal artificial neural network architecture and parameters will be interesting to learn about; in regards to logic, reasoning, and inference.
According to "Cliques of Neurons Bound into Cavities Provide a Missing Link between Structure and Function" https://www.frontiersin.org/articles/10.3389/fncom.2017.0004... , the human brain appears to be [at most] 11-dimensional (11D); in terms of algebraic topology https://en.wikipedia.org/wiki/Algebraic_topology
Relatedly,
"Study shows how memories ripple through the brain" https://www.ninds.nih.gov/News-Events/News-and-Press-Release...
> The [NeuroGrid] team was also surprised to find that the ripples in the association neocortex and hippocampus occurred at the same time, suggesting the two regions were communicating as the rats slept. Because the association neocortex is thought to be a storage location for memories, the researchers theorized that this neural dialogue could help the brain retain information.
Re: Topological graph theory [1], is it possible to embed a graph on a space filling curve [2] (such as a Hilbert R-tree [3])?
[1] https://en.wikipedia.org/wiki/Topological_graph_theory
[2] https://en.wikipedia.org/wiki/Space-filling_curve
[3] https://en.wikipedia.org/wiki/Hilbert_R-tree
[4] https://github.com/bup/bup (git packfiles)
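A graph's nodes can at least be linearized along such a curve; the classic xy-to-distance mapping (iterative form of the well-known Hilbert curve algorithm, where n must be a power of two) is short:

```python
def xy2d(n, x, y):
    """Map (x, y) in an n x n grid (n a power of two) to its
    distance d along the Hilbert curve."""
    d = 0
    s = n // 2
    while s > 0:
        rx = 1 if (x & s) > 0 else 0
        ry = 1 if (y & s) > 0 else 0
        d += s * s * ((3 * rx) ^ ry)
        # rotate the quadrant so the sub-curve has standard orientation
        if ry == 0:
            if rx == 1:
                x = s - 1 - x
                y = s - 1 - y
            x, y = y, x
        s //= 2
    return d
```

Nearby (x, y) points tend to land near each other in d, which is exactly the locality property Hilbert R-trees exploit.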
Interstellar Visitor Found to Be Unlike a Comet or an Asteroid
It's too bad we're not ready to launch probes to visit and explore such transient objects at a moment's notice.
I think it would make sense to plan a probe mission where the probe would be put into storage to stand by for these kinds of events. It would still be a challenge, and it might be impossible to find a launch window and reach the passing object in time, but it would be worth trying.
Just imagine how tragic it would be if an ancient artifact of an alien civilization drifted by Earth and we didn't manage to have at least a look at it.
Wouldn’t such an object be emitting radio signals that earth could receive?
Not if it's something like another civilization's Tesla Roadster.
> Not if it's something like another civilization's Tesla Roadster.
'Oumuamua is red and headed toward Pegasus (the winged horse) after a very long journey that began long ago and far away in spacetime. It is tumbling wildly off-kilter, potentially creating a magnetic field that would be useful for interplanetary space travel.
They're probably pointing us to somewhere else from somewhere else.
If this is any indication of the state of another civilization's advanced physics, and it missed us by a wide margin, they're probably laughing at our energy and water markets; and indicating that we should be focused on asteroid impact avoidance (and then we will really laugh about rockets and red electromagnetic kinetic energy machines and asteroid mining). https://en.wikipedia.org/wiki/Asteroid_impact_avoidance
"Amateurs"
[We watch it fly by, heads all turning]
Maybe it would've been better to have put a lone starman in the passenger seat, or two starpeople total?
Given the skull shape of October 2015 TB145 [1] (due to return in November 2018), maybe 'Oumuamua [2] is a pathology of Mars and an acknowledgement of our spacefaring intentions? Red, subsurface water, disrupted magnetic field.
[1] https://en.wikipedia.org/wiki/2015_TB145
[2] https://en.wikipedia.org/wiki/%CA%BBOumuamua
In regards to a red, unshielded, earth vehicle floating in solar orbit with a suited anthropomorphic creature whose head is too big for the windshield:
"What happened here?"
Publishing more data behind our reporting
Publishing raw data itself is definitely a good start, but there also needs to be a push toward a standardized way of sharing data along with its lineage (dependent sources, experimental design/generation process, metadata, graph relationship of other uses, etc.).
> Publishing raw data itself is definitely a good start, but there also needs to be a push toward a standardized way of sharing data along with its lineage (dependent sources, experimental design/generation process, metadata, graph relationship of other uses, etc.).
Linked Data based on URIs is reusable. ( https://5stardata.info )
The Schema.org Health and Life Sciences extension is ahead of the game here, IMHO. MedicalObservationalStudy and MedicalTrial are subclasses of https://schema.org/MedicalStudy . {DoubleBlindedTrial, InternationalTrial, MultiCenterTrial, OpenTrial, PlaceboControlledTrial, RandomizedTrial, SingleBlindedTrial, SingleCenterTrial, and TripleBlindedTrial} are subclasses of schema.org/MedicalTrial.
A schema.org/MedicalScholarlyArticle (a subclass of https://schema.org/ScholarlyArticle ) can have a https://schema.org/Dataset. https://schema.org/hasPart is the inverse of https://schema.org/isPartOf .
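A minimal JSON-LD sketch of that hasPart/isPartOf linkage, assuming hypothetical @id URLs and names (only the schema.org types and properties are real):

```python
import json

# A MedicalScholarlyArticle linked to its Dataset via schema:hasPart;
# the Dataset points back with schema:isPartOf (they are inverses).
# All identifiers here are hypothetical.
article = {
    "@context": "https://schema.org",
    "@type": "MedicalScholarlyArticle",
    "@id": "https://example.org/articles/trial-42",
    "name": "A hypothetical placebo-controlled trial",
    "hasPart": {
        "@type": "Dataset",
        "@id": "https://example.org/datasets/trial-42-raw",
        "name": "Raw trial data",
        "isPartOf": {"@id": "https://example.org/articles/trial-42"},
    },
}

# Serialize as JSON-LD for publication alongside the article.
doc = json.dumps(article, indent=2)
```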
More structured predicates which indicate the degree to which evidence supports/confirms or disproves current and other hypotheses (according to a particular Person or Persons on a given date and time; given a level of scrutiny of the given information) are needed.
In regards to epistemology, there was some work on Fact Checking ( e.g. https://schema.org/ClaimReview ) in recent times. To quote myself here, from https://news.ycombinator.com/item?id=15528824 :
> In terms of verifying (or validating) subjective opinions, correlational observations, and inferences of causal relations; #LinkedMetaAnalyses of documents (notebooks) containing structured links to their data as premises would be ideal. Unfortunately, PDF is not very helpful in accomplishing that objective (in addition to being a terrible format for review with screen reader and mobile devices): I think HTML with RDFa (and/or CSVW JSONLD) is our best hope of making at least partially automated verification of meta analyses a reality.
"#LinkedReproducibility"; "#LinkedMetaAnalyses", "#StudyGraph"
CSV 1.1 – CSV Evolved (for Humans)
Well, if you want to improve tabular data formats:
1. Add a version identifier / content-type on the first line!
2. Create a formal grammar for this CSV format
3. Specify preferred character-encoding
4. Provide some tooling (validation, CSV 1.1 => HTML, CSV => Excel)
5. Add the option to specify column type (string, int, date)
6. Specify ISO-8601 as the preferred date format
7. Allow 'reheading' the columns in the file itself. This is useful in streaming data.
8. Specify the format of the newlines.
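To make the wishlist concrete, here is a sketch of what items 1 (version line), 5 (typed columns), and 6 (ISO-8601 dates) might look like in a file. The `#csv/1.1` version line and `name:type` header syntax are invented here for illustration, not an existing standard:

```python
import csv
import io

# A hypothetical "CSV 1.1" file: version/charset line, then typed headers.
raw = """#csv/1.1;charset=utf-8
name:string,logged:date,count:int
widget,2018-10-01,3
"""

buf = io.StringIO(raw)
version = buf.readline().strip()   # peel off the version line first
reader = csv.reader(buf)
header = next(reader)              # ["name:string", "logged:date", ...]
row = next(reader)

# Map the declared column types to Python casts (dates kept as ISO-8601 strings).
casts = {"string": str, "date": str, "int": int}
names, kinds = zip(*(h.split(":") for h in header))
typed = [casts[k](v) for k, v in zip(kinds, row)]
```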
CSVW: CSV on the Web https://w3c.github.io/csvw/
"CSV on the Web: A Primer" http://www.w3.org/TR/tabular-data-primer/
"Model for Tabular Data and Metadata on the Web" http://www.w3.org/TR/tabular-data-model/
"Generating JSON from Tabular Data on the Web" (csv2json) http://www.w3.org/TR/csv2json/
"Generating RDF from Tabular Data on the Web" (csv2rdf) http://www.w3.org/TR/csv2rdf/
...
N. Allow authors to (1) specify how many header rows are metadata and (2) what each row is. For example: 7 metadata header rows: {column label, property URI [path], datatype URI, unit URI, accuracy, precision, significant figures}
With URIs, we can merge, join, and concatenate data (when e.g. study control URIs for e.g. single/double/triple blinding/masking indicate that the https://schema.org/Dataset meets meta-analysis inclusion criteria).
"#LinkedReproducibility"; "#LinkedMetaAnalyses"
Ask HN: Which plants can be planted indoors and easily maintained?
Chlorophytum comosum (spider plants) are good air-filtering houseplants that are also easy to take starts of: https://en.wikipedia.org/wiki/Chlorophytum_comosum
Houseplant: https://en.wikipedia.org/wiki/Houseplant
Graduate Student Solves Quantum Verification Problem
"Classical Verification of Quantum Computations" Mahadev. (2018)
https://arxiv.org/abs/1804.01082
https://www.arxiv-vanity.com/papers/1804.01082/
https://scholar.google.com/scholar?cluster=10138991277567750...
The down side to wind power
> To estimate the impacts of wind power, Keith and Miller established a baseline for the 2012‒2014 U.S. climate using a standard weather-forecasting model. Then, they covered one-third of the continental U.S. with enough wind turbines to meet present-day U.S. electricity demand. The researchers found this scenario would warm the surface temperature of the continental U.S. by 0.24 degrees Celsius, with the largest changes occurring at night when surface temperatures increased by up to 1.5 degrees. This warming is the result of wind turbines actively mixing the atmosphere near the ground and aloft while simultaneously extracting [energy] from the atmosphere's motion.
I am confused: How does the warming work exactly and is this actually a global climate effect? Because this part of the article makes it sound to me as if it's just a very localised change of temperature caused by the exchange of different air layers, which can't be right? Because you couldn't really compare that to climate change on a global scale.
The example is clearly hypothetical only. We're never going to cover one third of the continental US with wind turbines.
The more important information to me is that neither wind nor solar has the power density that has been claimed.
> For wind, we found that the average power density — meaning the rate of energy generation divided by the encompassing area of the wind plant — was up to 100 times lower than estimates by some leading energy experts
...
> For solar energy, the average power density (measured in watts per meter squared) is 10 times higher than wind power, but also much lower than estimates by leading energy experts.
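Power density as quoted is just generation rate divided by area, so it's easy to compute; the plant figures below are illustrative round numbers, not values from the study:

```python
# Power density = rate of energy generation / encompassing plant area.
def power_density(avg_output_watts, area_m2):
    return avg_output_watts / area_m2

# A hypothetical wind plant averaging 100 MW over 200 km^2 of land:
wind = power_density(100e6, 200e6)   # -> 0.5 W/m^2

# Solar at roughly 10x wind's density, per the quoted comparison:
solar = wind * 10                    # -> 5.0 W/m^2
```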
Then you have the separate problem that the wind doesn't always blow and the sun doesn't always shine, so you need a huge storage infrastructure (batteries, presumably) alongside the wind and solar generating infrastructure.
IMO nuclear is the only realistic alternative to coal to provide reliable, zero-emission "base load" power generation. Wind and solar could make sense in some use cases but not in general.
> IMO nuclear is the only realistic alternative to coal to provide reliable, zero-emission "base load" power generation. Wind and solar could make sense in some use cases but not in general.
How much heat energy does a reactor, with n meters of concrete around it, located on a water supply in order to run an open or closed cooling loop, and protected with national security resources, waste into the environment?
I'd be interested to see which power sources the authors of this study would choose as a control for these sensational stats.
From https://news.ycombinator.com/item?id=17806589 :
> Canada (2030), France (2021), and the UK (2025) are all working to entirely phase out coal-fired power plants for very good reasons (such as neonatal health).
Would you burn a charcoal grill in an enclosed space like a garage? No.
Thermodynamics of Computation Wiki
"Quantum knowledge cools computers: New understanding of entropy" (2011) https://www.sciencedaily.com/releases/2011/06/110601134300.h...
> The new study revisits Landauer's principle for cases when the values of the bits to be deleted may be known. When the memory content is known, it should be possible to delete the bits in such a manner that it is theoretically possible to re-create them. It has previously been shown that such reversible deletion would generate no heat. In the new paper, the researchers go a step further. They show that when the bits to be deleted are quantum-mechanically entangled with the state of an observer, then the observer could even withdraw heat from the system while deleting the bits. Entanglement links the observer's state to that of the computer in such a way that they know more about the memory than is possible in classical physics.
"The thermodynamic meaning of negative entropy" (2011) https://www.nature.com/articles/nature10123
Landauer's principle: https://en.wikipedia.org/wiki/Landauer%27s_principle
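For the classical case (irreversible erasure of a bit, no entanglement tricks), Landauer's bound is k_B·T·ln 2 of heat per erased bit; a quick check at room temperature:

```python
import math

# Landauer's principle: irreversibly erasing one bit dissipates at least
# k_B * T * ln(2) joules of heat.
k_B = 1.380649e-23   # Boltzmann constant, J/K
T = 300.0            # room temperature, kelvin

e_bit = k_B * T * math.log(2)   # minimum joules per erased bit
# ~2.87e-21 J -- many orders of magnitude below what conventional
# CMOS dissipates per bit operation.
```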
"Thin film converts heat from electronics into energy" (2018) http://news.berkeley.edu/2018/04/16/thin-film-converts-heat-...
> This study reports new records for pyroelectric energy conversion energy density (1.06 Joules per cubic centimeter), power density (526 Watts per cubic centimeter) and efficiency (19 percent of Carnot efficiency, which is the standard unit of measurement for the efficiency of a heat engine).
"Pyroelectric energy conversion with large energy and power density in relaxor ferroelectric thin films" (2018) https://www.nature.com/articles/s41563-018-0059-8
Carnot heat engine > Carnot cycle, Carnot's theorem, "Real heat engines": https://en.wikipedia.org/wiki/Carnot_heat_engine
Carnot's theorem > Applicability to fuel cells and batteries: https://en.wikipedia.org/wiki/Carnot%27s_theorem_(thermodyna...
> Since fuel cells and batteries can generate useful power when all components of the system are at the same temperature [...], they are clearly not limited by Carnot's theorem, which states that no power can be generated when [...]. This is because Carnot's theorem applies to engines converting thermal energy to work, whereas fuel cells and batteries instead convert chemical energy to work.[6] Nevertheless, the second law of thermodynamics still provides restrictions on fuel cell and battery energy conversion
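Carnot's theorem gives the ceiling that the paper's "19 percent of Carnot efficiency" is measured against. The reservoir temperatures below are illustrative, not taken from the paper:

```python
# Carnot efficiency: maximum fraction of heat convertible to work
# between a hot and a cold reservoir (temperatures in kelvin).
def carnot_efficiency(t_hot_k, t_cold_k):
    return 1.0 - t_cold_k / t_hot_k

# Hypothetical reservoirs: a 100 C heat source rejecting to a 25 C sink.
eta_max = carnot_efficiency(373.15, 298.15)   # ~0.20

# "19 percent of Carnot efficiency", applied to that ceiling:
eta_film = 0.19 * eta_max
```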
[deleted]
Is there enough heat energy from a datacenter to -- rather than heating oceans (which can result in tropical storms) -- turn a turbine (to convert heat energy back into electrical energy)?
Is there a statistic which captures the amount of heat energy discharged into ocean/river/lake water? "100% clean energy with PPAs (Power Purchase Agreements)" while bleeding energy into the oceans isn't quite representative of the total system.
"How to Reuse Waste Heat from Data Centers Intelligently" (2016) https://www.datacenterknowledge.com/archives/2016/05/10/how-...
> There are two big issues with data center waste heat reuse: the relatively low temperatures involved and the difficulty of transporting heat. Many of the reuse applications to date have used the low-grade server exhaust heat in an application physically adjacent to the data center, such as a greenhouse or swimming pool in the building next door. This is reasonable given the relatively low temperatures of data center return air, usually between 28° and 35°C (80-95°F), and the difficulty in moving heat around. Moving heat energy frequently requires insulated ducting or plumbing instead of cheap, convenient electrical cables. Trenching and installation to run a hot water pipe from a data center to a heat user may cost as much as $600 per linear foot. Just the piping to share heat with a facility one-quarter mile away might add $750,000 or more to a data center construction project. There’s currently not much that can be done to reduce this cost.
> To address the low-temperature issue, some data center operators have started using heat pumps to increase the temperature of waste heat, making the thermal energy much more valuable, and marketable. Waste heat coming out of heat pumps at temperatures in the range of 55° to 70°C (130-160°F) can be transferred to a liquid medium for easier transport and can be used in district heating, commercial laundry, industrial process heat, and many more. There are even High Temperature (HT) and Very High Temperature (VHT) heat pumps capable of moving low-grade data center heat up to 140°C.
Heat Pump: https://en.wikipedia.org/wiki/Heat_pump
"Data Centers That Recycle Waste Heat" https://www.datacenterknowledge.com/data-centers-that-recycl...
Why Do Computers Use So Much Energy?
> Also, to foster research on this topic we have built a wiki, combining lists of papers, websites, events pages, etc. We highly encourage people to visit it, sign up, and start improving it; the more scientists get involved, from the more fields, the better!
Thermodynamics of Computation Wiki https://centre.santafe.edu/thermocomp/Santa_Fe_Institute_Col...
Justice Department Sues to Stop California Net Neutrality Law
Sorry if I get some basic understanding of the law wrong but...
Isn't this the same thing as regulating car emissions? Doesn't 822 only apply to providers in the state itself? Wouldn't it be that the telecoms are welcome to engage in another method of end-customer billing in other states?
What am I missing?
> Like California’s auto emissions laws that forced automakers to adopt the standards for all production, the state’s new net neutrality rules could push broadband providers to apply the same rules to other states.
I think you're right, the motives are very similar.
> Attorney General Jeff Sessions said that California’s net neutrality law was illegal because Congress granted the federal government, through the F.C.C., the sole authority to create rules for broadband internet providers. “States do not regulate interstate commerce — the federal government does,” Mr. Sessions said in a statement.
I thought Republicans were pro-states' rights and limited government? How does their position on this jibe with their ideology?
Expansion of federal jurisdiction under the Commerce Clause is an egregious violation of Constitutional law.
Does the federal government have the enumerated right under the Commerce Clause to, for example, ban football for anyone that doesn't have a disability? No!
Was the Commerce Clause sufficient authorization for Federal prohibition of alcohol? No! An Amendment to the Constitution was necessary. And Federal alcohol prohibition, along with the uneven State prohibitions it necessitated, miserably failed to achieve the intended outcomes.
Where is the limit? How can they claim to support a states' rights, limited government position while expanding jurisdiction under the Interstate Commerce Clause? "Substantially affecting" interstate commerce is a very slippery slope.
Furthermore, de-classification from Title II did effectively - as the current administration's FCC very clearly argued (in favor of special interests over those of the majority) - relieve the FCC of authority to regulate ISPs: they claimed that it's the FTC's job, and now they're claiming it's their own job.
Without Title II classification, FCC has no authority to preempt state net neutrality regulation. California and Washington have the right to regulate ISPs within their respective states.
Outrageous!
Limited government: https://en.wikipedia.org/wiki/Limited_government
States' rights: https://en.wikipedia.org/wiki/States%27_rights
[Interstate] Commerce Clause: https://en.wikipedia.org/wiki/Commerce_Clause
Net neutrality in the United States > Repeal of net neutrality policy: https://en.m.wikipedia.org/wiki/Net_neutrality_in_the_United...
The limit is established, and constantly reevaluated, by the Supreme Court. For example, it was held that gun control cannot be done through the Commerce Clause.
Car emissions and ISPs are different. As ISPs are very much perfect examples of truly local things (they need to reach your devices with EM signals either via cables or air radio), the Federal government might try to argue that the net neutrality regulation of California affects the whole economy substantially, because it allows too much interstate competition due to the lack of bundling/throttling by ISPs.
Similarly, the problem with car emissions might be that requiring things at the time of sale affects what kinds of cars are sold in CA.
Is the Commerce Clause too vague? Yes. Is there a quick and sane way to fix it? I see none. Is it at least applied consistently? Well, sort of. But we shall see.
ISPs are the very opposite of local, as the only reason I have an ISP is to deliver bits from the rest of the world. Of course, the FCC doesn't seem to understand that...
To summarize the points made in [1]: products can be sold across state lines, internet service sold in one state cannot be sold across state lines.
[1] https://news.ycombinator.com/item?id=18111651
In my opinion, the court has significantly erred in redefining interstate commerce to include (1) intrastate-only commerce; and (2) non-commerce (e.g. locally grown and unsold wheat).
Furthermore - and this is a bit off topic - unalienable natural rights (Equality, Life, Liberty, and pursuit of Happiness) are of higher precedence. I mention this because this is yet another case where the court will be interpreting the boundary between State and Federal rights; and it's very clear that the founders intended for the powers of the federal government to be limited -- certainly not something that the Commerce Clause should be interpreted to supersede.
What penalties and civil fines are appropriate for States or executive branch departments that violate the Constitution; for failure to uphold Oaths to uphold the Constitution?
The problem is, someone has to interpret what kind of economy the Founders intended.
Is it okay if a State opts to withdraw from the interstate market for wheat? Because without power to meddle with intra-state production, consumption and transactions, it's entirely possible.
White House Drafts Order to Probe Google, Facebook Practices
This sounds like they want the equal-representation policies that the Republican Party got rolled back in the 80s (ruled unconstitutional, IIRC). That rollback is what allowed the rise of partisan “news”. It seems like any “equal exposure” policies would hit the same issues.
That said, the primary “imbalanced exposure” seems to be due to evicting people who simply spend their time attacking minorities, attacking equal rights, and promoting violence towards anyone they dislike. For whatever reason the Republican Party seems to have decided that those people represent “conservative” views that private companies should have to support.
i. e. : People who are exercising their right to free speech.
'Right to free speech' does not exist outside the government. It never has, unless there's an amendment to the First Amendment that no one is telling us about.
Repeating what I said in an earlier comment: they were able to grow to the size they have become because they are exempted from libel laws under safe harbor. The argument for that was that they were neutral platforms. They no longer are, so they either need to lose those protections or be subject to the first amendment, but they should not be able to have it both ways. And I don't think there is case law to back this one way or the other yet.
> they were able to grow to the size they have become because they are exempted from libel laws under safe harbor
This was not a selective protection. When the government grants limited resources like electromagnetic spectrum and right of way, they're not directly making a monopoly, but the FCC does then claim right to regulate speech.
In the interest of fairness, the FCC classed telecommunication service providers as common carriers; thus authorizing FCC to pass net neutrality protections which require equal prioritization of internet traffic. (No blocking, No throttling, No paid prioritization). The current administration doesn't feel that that's fair, and so they've moved to dismantle said "burdensome regulations".
The current administration is now apparently attempting to argue that information service providers - which are all equally granted safe harbor and obligated to comply with the DMCA - have no right to take down abuse and harassment because of anti-trust and monopoly concerns; therefore, supposedly, Freedom of Speech doesn't apply to these corporate persons.
Selective bias, indeed! Broadcast TV and Radio are subject to different rules than Cable (non-broadcast) TV.
Other regimes have attempted to argue that the government has the right to dictate the media as well.
Taking down abuse and harassment is necessary and well within the rights of a person and a corporation in the United States. Taking down certain content is now legally required within 24 hours of notice from the government in the EU.
Where is the line between a media conglomerate that produces news entertainment and an information service provider? If there is none, and the government has the right to regulate "equal time" on non-granted-spectrum media outlets, future administrations could force ConservativeNewsOutletZ and LiberalNewsOutletZ to carry specific non-emergency content, to host abusive and offensive rhetoric, and to be sued for being forced to do so, because there would be no safe harbor.
Can anyone find the story of how the GOP strongarmed and intimidated Facebook into "equal time" (and then we were all shoved full of apparently Russian conservative "fake news" propaganda) before the most recent election where the GOP won older radio, TV, and print voters and young people didn't vote because it appeared to be unnecessary?
Meanwhile, the current administration rolled back the "burdensome regulation" that was to prevent ISPs from selling complete internet usage history; regardless of age.
Maybe there's an exercise that would be helpful for understanding the "corporate media filter" and the "social media filter"?
You, having no money -- while watching corporate profits soar and income inequality grow to unprecedented heights -- will choose to take a job that requires you to judge whether thousands of reported pieces of content a day are abusive, harassing, making specific threats, inciting specific destructive acts, recruiting for hate groups, depicting abuse; or just good 'ol political disagreement over issues, values, and the appropriate role of the punishing and/or nurturing state. You will do this for weeks or months, because that's your best option, because nobody else is standing in the mirror behind these people who haven't learned to respectfully disagree over facts and data (evidence).
Next, you will plan segments of content time interspersed with ads paid for by people who are trying to sell their products, grow their businesses, and reach people. You will use a limited amount of our limited electromagnetic spectrum which the government has sold your corporate overlords for a limited period of time, contingent upon your adherence to specific and subjective standards of decency as codified in the stated regulations.
In both cases, your objective is to maximize profit for shareholders.
Your target audiences may vary from undefined (everyone watching), to people who only want to review fun things that they agree with in their safe little microcosm of the world, to people who know how to find statistics like corporate profits, personal savings rate, infant mortality, healthcare costs per capita, and other Indicators identified as relevant to the Targets and Goals found in the UN Sustainable Development Goals (Global Goals Indicators).
Do you control what the audience shares?
Ask HN: Books about applying the open source model to society
I've been thinking for some time now that as productivity keeps growing, not all people will need to work any more. Society will eventually start to resemble an open source project where a few core contributors do the real work (and get to decide the direction), some others help around, and the majority of people just benefit without having to do anything. I'm wondering if any books have been written to explore this concept further?
> I've been thinking for some time now that as productivity keeps growing, not all people will need to work any more.
How much energy do autotrophs and heterotrophs need to thrive?
"But then we'll be rewarding laziness!"
Some people do enjoy the work they've chosen to do. We enjoy the benefits of upward mobility here in the US; the land of opportunity.
Why would I fully retire at 65 (especially if lifespan extension really is in reach)?
> Society will eventually start to resemble an open source project where a few core contributors do the real work (and get to decide the direction), some others help around, and the majority of people just benefit without having to do anything.
Open-source governance https://en.wikipedia.org/wiki/Open-source_governance
Free-rider problem https://en.wikipedia.org/wiki/Free-rider_problem
As we continue to reward work, the people who are investing in the means of production (energy, labor, automation, raw materials) and science (research and development; education) continue to amass wealth and influence.
This concentration of wealth -- wealth inequality -- has historically presaged and portended unrest.
How contributions to open source projects are reinforced, what motivates people who choose to contribute (altruism, enlightened self-interest, compassion, acceptance), and what makes a competitive and thus sustainable open source project make for an interesting study.
... Business models for open-source software: https://en.wikipedia.org/wiki/Business_models_for_open-sourc...
... Political Science: https://en.wikipedia.org/wiki/Political_science
... National currencies are valued in FOREX markets: https://en.wikipedia.org/wiki/Foreign_exchange_market
> I'm wondering if any books have been written to explore this concept further?
"The Singularity is Near: When Humans Transcend Biology" (2005) contains a number of extrapolated predictions; chief among these is that there will continue to be exponential growth in technological change https://en.wikipedia.org/wiki/The_Singularity_Is_Near
... Until we reach limits; e.g. the carrying capacity of our ecosystem, the edge of the universe.
"The Limits to Growth" (1972, 2004) https://en.wikipedia.org/wiki/The_Limits_to_Growth
"Leverage Points: Places to Intervene in a System" (2010) https://news.ycombinator.com/item?id=17781927
Who owns what and who 'gets to' just chill while the solar robots brush their teeth? Heady questions. "Tired yet?"
The Aragon Project has a really interesting take on open source governance:
""" IMAGINE A NATION WITHOUT LAND AND BORDERS
A digital jurisdiction
> Aragon Network will be the first community governed decentralized organization whose goal is to act as a digital jurisdiction, an online decentralized court system that isn’t bound by traditional artificial barriers such as national jurisdictions or the borders of a single country.
Aragon organizations can be upgraded seamlessly using our aragonOS architecture. They can solve disputes between two parties by using the decentralized court system, a digital jurisdiction that operates only online and utilizes your peers to resolve issues.
The Aragon Network Token, ANT, puts the power into the hands of the people participating in the operation of the Network. Every single aspect of the Network will be governed by those willing to make an effort for a better future. """
Today, Europe Lost The Internet. Now, We Fight Back
Here's a quote from this excellent article:
> An error rate of even one percent will still mean tens of millions of acts of arbitrary censorship, every day.
And a redundant -- positively defiant -- link and page title:
"Today, Europe Lost The Internet. Now, We Fight Back." https://www.eff.org/deeplinks/2018/09/today-europe-lost-inte...
Firms with 50 or less employees should stay that small, really.
VPN providers in North and South America FTW.
> VPN providers in North and South America FTW.
Article 13 will affect the entire Internet, not just people in Europe. Most people on the Internet use large, multinational platforms. Those platforms will set rules according to the lowest common denominator, because it's the easiest to implement.
This means that people all over the world are going to have a much more difficult time with any user-generated content. That's true even for entirely original user-created content with no other copyright holder involved (look at how badly Content ID has played out).
Or they just stop doing business in Europe.
Since Europe is quite a big market, that may not be an option; it's easier to geolocate and restrict just EU traffic.
Or just not, if there's no nexus to Europe. I suspect for a small business, the right thing to do would be to simply ignore EU directives. I would not be surprised if the US passes a law to make judgments against US companies without a European nexus (users in Europe would not count) unenforceable in the US. That was done with the SPEECH Act to stop libel tourism.
Technically, the phrase "Useful Arts and Sciences" in the Copyright Clause of the US Constitution applies to just that; the definitions of which have coincidentally changed over the years.
The harms to Freedom of Speech -- even an impossible 99% accuracy in content filtering still results in far too much censorship -- so significantly outweigh the benefits to a limited number of special interests intending to thwart supposedly inferior American information services (which also currently host "art" and content pertaining to the "useful arts") that it's hard to believe this new policy will have its intended effects.
Haven't there been multiple studies showing that free marketing from e.g. content piracy -- people who experience and recommend said goods at $0 -- is actually a net positive for the large corporate entertainment industry? That, unimpeded, content spreads like the common cold through word of mouth, resulting in a greater number of artful impressions.
How can they not anticipate de-listing of EU content from news and academic article aggregators as an outcome of these new policies? (Resulting in even greater outsized impact on one possible front page that consumers can choose to consume)
For countries in the EU with less than 300 million voters, if you want:
- time for your headline: $
- time for your snippet: $$
- time for your og:description: $$
- free video hosting: $$$
- video revenue: $$$$
- < 30% American content: $$$$$
Pay your bill.
And what of academic article aggregators? Can they still index schema:ScholarlyArticle titles and provide a value-added information service for science?
Consumer science (a.k.a. home economics) as a college major
> That's why we need to bring back the old home economics class. Call it "Skills for Life" and make it mandatory in high schools. Teach basic economics along with budgeting, comparison shopping, basic cooking skills and time management.
Some Jupyter notebooks for these topics that work with https://mybinder.org could be super helpful. A self-paced edX course could also be a great intro to teaching oneself through online learning.
* Personal Finance (budgets, interest, growth, inflation, retirement)
* Food Science (nutrition, meal planning for n people, food prep safety, how long certain things can safely be left out on the counter)
* Productivity Skills (GTD, context switching overhead, calendar, email labels, memo app / shared task lists)
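As one example of what such a notebook cell might contain for the "interest, growth, retirement" part of the Personal Finance bullet, a future-value-of-an-annuity sketch; the rate and contribution amounts are example numbers:

```python
# Compound growth of fixed monthly contributions (ordinary annuity):
# FV = monthly * ((1 + r)^n - 1) / r, with r = monthly rate, n = months.
def future_value(monthly, annual_rate, years):
    r = annual_rate / 12
    n = years * 12
    return monthly * (((1 + r) ** n - 1) / r)

# Example: $200/month at 5% nominal annual return for 30 years
# grows to roughly $166k -- the point the "growth" bullet is making.
fv = future_value(200, 0.05, 30)
```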
There were FACS (Family and Consumer Studies/Sciences) courses in our middle and high school curricula: nutrition, cooking, sewing; family planning, carrying a digital baby around for a while.
Home economics https://en.wikipedia.org/wiki/Home_economics
* Family planning
https://en.wikipedia.org/wiki/Family_planning
> * Personal Finance (budgets, interest, growth, inflation, retirement)
Personal Finance https://en.wikipedia.org/wiki/Personal_finance
Khan Academy > College, careers, and more > Personal finance https://www.khanacademy.org/college-careers-more/personal-fi...
"CS 007: Personal Finance For Engineers" https://cs007.blog
https://reddit.com/r/personalfinance/wiki
> * Food Science (nutrition, meal planning for n people, food prep safety, how long certain things can safely be left out on the counter)
Food Science https://en.wikipedia.org/wiki/Food_science
Dietary management https://en.wikipedia.org/wiki/Dietary_management
Nutrition Education: https://en.wikipedia.org/wiki/Nutrition_Education
MyPlate https://en.wikipedia.org/wiki/MyPlate
Healthy Eating Plate https://www.hsph.harvard.edu/nutritionsource/healthy-eating-...
How to make salads, smoothies, sandwiches
How to compost and avoid unnecessary packaging
* School, College, Testing, "How Children Learn"
GED, SAT, ACT, MCAT, LSAT, GRE, GMAT, ASVAB
Defending a Thesis, Bar Exam, Boards
Khan Academy > College, careers, and more https://www.khanacademy.org/college-careers-more
Educational Testing https://wrdrd.github.io/docs/consulting/educational-testing
529 Plans (can be used for qualifying educational expenses for any person) https://en.wikipedia.org/wiki/529_plan
Middle School "Glimpse" project: Past, Present, Future. Present, Future: plan your 4-year high school course plan, pick 3 careers, pick 3 colleges (and how much they cost)
High school literature: write a narrative essay for college admissions
* Health and Medicine
How to add emergency contact and health information to your phone, carseat (ICE: In Case of Emergency)
How to get health insurance ( https://healthcare.gov/ )
"What's your blood type?" (?!)
Khan Academy > Science > Health and Medicine https://www.khanacademy.org/science/health-and-medicine
Facebook vows to run on 100 percent renewable energy by 2020
Miami Will Be Underwater Soon. Its Drinking Water Could Go First
Now, now, let's focus on the positives here:
- more pollution from shipping routes through the Arctic circle (and yucky-looking icebergs that tourists don't like)
- less beachfront property
- more desalinatable water
- hotter heat
- more revulsive detestable significant others (displaced global unrest)
- costs of responding to natural disasters occurring with greater frequency due to elevated ocean temperatures
- fewer parking spaces (!)
What are the other costs and benefits here?
I've received a number of downvotes for this comment. I think it's misunderstood, and that's my fault: I should have included [sarcasm] around the whole comment [/sarcasm].
I've written about our need to address climate change here in past comments. I think the administration's climate change denials (see: "climate change politifact") and regulatory rollbacks are beyond despicable: they're sabotaging the United States by allowing more toxic chemicals into the environment that we all share, and creating more sites that must be cleaned up with tax dollars that aren't there, because these industries pay far less than benchmarks in terms of effective tax rate. We know that vehicle emissions, mercury, and coal ash are toxic: why would we allow people to violate the rights of others in that way?
A person could voluntarily consume said toxic byproducts and not have violated their own rights or the rights of others, you understand. There's no medical value and low potential for abuse, so we just sit idly by while they're violating the rights of other people by dumping toxic chemicals into the environment that are both poisonous and strongly linked to climate change.
What would help us care about this? A sarcastic list of additional reasons that we should care? No! Miami underwater during tourist season is enough! I've had enough!
So, my mistake here - my downvote-earning mistake - was dropping my generally helpful, hopeful tone for cynicism and sarcasm that wasn't motivating enough.
We need people to regulate pollution in order to prevent further costs of climate change. Water in the streets holds up commerce and travel, hampers national security, and destroys roads.
We must stop rewarding pollution if we want it - and definitely resultant climate change - to stop. What motivates other people to care?
Free hosting VPS for NGO project?
The Burden: Fossil Fuel, the Military and National Security
Here's a link to the video: https://vimeo.com/194560636
Scientists Warn the UN of Capitalism's Imminent Demise
The actual document title: "Global Sustainable Development Report 2019 drafted by the Group of independent scientists: Invited background document on economic transformation, to chapter: Transformation: The Economy" (2018) https://bios.fi/bios-governance_of_economic_transition.pdf [PDF]
Why I distrust command economies (beyond our experiences with violent fascism and defense overspending, and the subsequent failures of various communist regimes):
We have elections today. We don't choose to elect people that regard the environment (our air, water, land, and other natural resources) as our most important focus. A command economy driven by these folks for longer than a term limit would be even more disastrous.
The market does not solve for 'externalities': things that aren't costed in. We must have regulation to counteract the blind optimization for profit (and efficiency) which capitalism rewards most.
Environmental regulation is currently insufficient worldwide. That is the consensus behind the Paris Agreement, which 195 countries signed in 2015. https://en.wikipedia.org/wiki/Paris_Agreement
Maybe incentives?
We could sell tokens for how much pollution we're allowed to f### everyone else over with, and penalize exceeding the amount purchased. That would incentivize firms to pollute less so that they save money by buying fewer tokens. (Europe does this already, and it's still not going to save the planet from industrial production externalities.)
So, while I'm wary of any suggestion that a command economy would somehow bring forth talent in governance, I look to this article for actionable suggestions that penalize and/or incentivize sustainable business and living practices.
Sustainable reporting really is a must: how can I design an investment portfolio that excludes reckless, irresponsible, indifferent, and careless investments and highly values sustainability?
No one likes to be driven by harsh penalties; everyone likes to be rewarded (even with carrots as incentives).
Markets do not solve for long-term outcomes. Case in point: the market has not chosen the most energy-efficient cryptocurrencies. Is this an information asymmetry issue: do people just not know, or not care because the incentives are so alluring, the brand is so strong, or the perceived security assurances of the network outweigh the energy use (and environmental impact), even in comparison to dry cleaning and fossil fuel transport?
How would a command economy respond to this? It really is denial and delusion to think that the market will cast aside less energy efficient solutions in order to save the environment all on its own.
So, what do we do?
Do we incentivize getting inefficient vehicles off of the road and into a recycling plant where they belong?
Do we shut down major sources of pollution (coal plants, vehicle emissions)?
Do we create tokens to account for pollution allowances (for carbon and other toxic f###ing chemicals)?
Do we cut irrational subsidies for industries that don't pay their taxes (even when they make money); so that we're aware of the actual costs of our behavior?
Do we grow hemp to absorb carbon, clean up the soil, replace emissions, and store energy?
Who's in the mood to dom these greedy shortsighted idiots into saving themselves and preventing the violation of our right to health (life)? No, you can't because you're busy violating your own rights and finding drugs/druggies and that's not allowed? Is that a lifetime position?
"Go burn a charcoal grill and your gas vehicle in your closed garage for awhile and come talk to me." That's really what we're dealing with here.
Anyways, this paper raises some good points; although I have my doubts about command economies.
[strikethrough] You can't do that to yourself. [/strikethrough] You can't do that to others (even if you pay for their healthcare afterwards).
Where's Captain Planet when you need 'em, anyways?
The problem may actually be solved by a free market... But it has to have different optimizations.
"Value" or "Quality" of a company has to stop being measured in unconstrained growth and instead measured in minimized externalities, while achieving your stated goal.
Accounting has to become a hell of a lot more complicated. We're talking "keep track of your industrial waste product", possibly even having to take responsibility in some manner for dealing with it directly.
There needs to be some societal change as well. We need to look at how we've stretched our supply and industry chains worldwide and start to figure out ways to minimize transportation and production costs.
Competition may need to fundamentally change its nature as well. Trade secrets need to be dragged out into the light of day. It's not a contest of outdoing the other guy, but of seeing who can find a way to make the entire industry more efficient.
Definitely a lot of change needed. That is for sure.
Firefox Nightly Secure DNS Experimental Results
> The experiment generated over a billion DoH transactions and is now closed. You can continue to manually enable DoH on your copy of Firefox Nightly if you like.
...
> Using HTTPS with a cloud service provider had only a minor performance impact on the majority of non-cached DNS queries as compared to traditional DNS. Most queries were around 6 milliseconds slower, which is an acceptable cost for the benefits of securing the data. However, the slowest DNS transactions performed much better with the new DoH based system than the traditional one – sometimes hundreds of milliseconds better.
Long-sought decay of Higgs boson observed at CERN
Furthermore, both teams measured a rate for the decay that is consistent with the Standard Model prediction, within the current precision of the measurement.
And now everyone: Noooo, not again.
(Explanation: it's well-known that the Standard Model can't be completely correct but again and again physicists fail to find an experiment contradicting its predictions, see https://en.wikipedia.org/wiki/Physics_beyond_the_Standard_Mo... for example)
Well, the standard model can be correct. It is correct until some experiment proves otherwise.
It is full of unexplained hardcoded parameters, indeed, which need an explanation from outside of the SM.
> It is full of unexplained hardcoded parameters, indeed, which need an explanation from outside of the SM.
https://en.wikipedia.org/wiki/Magic_number_(programming)#Unn...
> The term magic number or magic constant refers to the anti-pattern of using numbers directly in source code
Who is going to request changes when doing a code review on a pull request from God?
But which branch do you pull from? God has been forked so many times, it's hard to keep track. There are several distros available, so I guess you just get to pick the one that checks the most of your needs at the time.
Sen. Wyden Confirms Cell-Site Simulators Disrupt Emergency Calls
Building a Model for Retirement Savings in Python
re: pulling historical data with pandas-datareader, backtesting, algorithmic trading: https://www.reddit.com/r/Python/comments/7zxptg/pulling_stoc...
re: historical returns
- [The article uses a constant 7% annual return rate]
- "The current average annual return from 1923 (the year of the S&P’s inception) through 2016 is 12.25%." https://www.daveramsey.com/blog/the-12-reality (but that doesn't account for inflation)
- https://www.quantopian.com/posts/56b62019a4a36a79da000059 (300%+ over n years (from a down market))
Is there a Jupyter notebook with this code (with a requirements.txt for https://mybinder.org (repo2docker))?
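For reference, the constant-return model amounts to a few lines of Python (the flat 7% annual return is the article's simplifying assumption; the links above give historical figures):

```python
def balance(years, annual_contribution, rate=0.07, start=0.0):
    """Project a nest egg under a constant annual return.

    A flat rate is the article's simplifying assumption; real returns
    vary year to year (see the historical-return links above).
    """
    b = start
    for _ in range(years):
        b = b * (1 + rate) + annual_contribution  # grow, then contribute
    return b

# $10k/year for 30 years at 7% compounds to roughly $945k.
print(round(balance(30, 10_000)))
```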
New E.P.A. Rollback of Coal Pollution Regulations Takes a Major Step Forward
Would you move your family downwind from a coal plant? Why or why not?
Coal ash pollutes air, water, rain (acid rain), crops (our food), and soil. Which rights of victims does coal pollution infringe? Who is liable for the health effects?
Canada (2030), France (2021), and the UK (2025) are all working to entirely phase out coal-fired power plants for very good reasons (such as neonatal health).
~"They're just picking on coal": No, we're choosing renewables that are lower cost AND don't make workers and citizens sick.
If you can mine for coal, you can set up solar panels and wind turbines.
If you can run a coal mine; you can buy some cheap land, put up solar panels and wind turbines, and connect it to the grid.
Researchers Build Room-Temp Quantum Transistor Using a Single Atom
"Quasi-Solid-State Single-Atom Transistors" (2018) https://onlinelibrary.wiley.com/doi/full/10.1002/adma.201801...
New “Turning Tables” Technique Bypasses All Windows Kernel Mitigations
Um – Create your own man pages so you can remember how to do stuff
Interesting project. I believe these two shell functions in the shell's initialization file (~/.profile, ~/.bashrc, etc.) can serve as a poor man's um:
    umedit() { mkdir -p ~/notes; vim ~/notes/"$1.txt"; }
    um() { less ~/notes/"$1.txt"; }
If you write these in .rst, you can generate actual manpages with Sphinx: http://www.sphinx-doc.org/en/master/usage/configuration.html...
sphinx.builders.manpage: http://www.sphinx-doc.org/en/master/_modules/sphinx/builders...
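As a sketch of the Sphinx route (a hypothetical conf.py; each man_pages tuple is (source file without extension, page name, description, authors, manual section)):

```python
# conf.py for a ~/notes directory of .rst files (hypothetical layout).
# Each man_pages entry turns one source file into a man page.
project = "um-notes"
master_doc = "index"
man_pages = [
    ("tar", "tar", "my notes on tar", [], 1),
]
```

Then `sphinx-build -b man ~/notes ~/notes/man` writes tar.1, viewable with `man -l ~/notes/man/tar.1`.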
[deleted]
Leverage Points: Places to Intervene in a System
Extremely interesting take, with a lot of good stuff to think about.
I'm unclear about the claim that less economic growth would be better, though, and the author seems very committed to it. I wasn't able to find the article they reference as explaining how less growth is what we really need (J.W. Forrester, World Dynamics, Portland OR, Productivity Press, 1971), and it comes from almost 50 years ago, which might as well be another economic era altogether.
Does anyone know what the arguments are, what assumptions they require, and whether they still apply today? My understanding is that "less growth is better" is a distinctly minority take amongst modern economists, but the rest of this article seems very intelligently laid out, so I'd like to dig deeper.
I've always thought that for any dial we have, there's always an optimal setting, whether it's tax rates, growth rates, birth rates, etc., and blindly pushing one way or the other (like both political parties tend to do) is not helpful, or at the very least merely indicates different value systems.
The book is called "The Limits to Growth", and it's a work of the Club of Rome. You can watch this video to get an idea of their work [1].
At first it's a very strange idea; after all, we work and consume every day in order to grow the economy. But every healthy system has a homeostatic point, a point where it doesn't need to grow, only to be maintained. We are now a fat society and we need to get our health back: we need to degrow. We need to work far fewer hours and consume much less.
Reducing working hours is a great leverage point. People will start to have time to care about the community and take care of their own health. Maybe they'll have more time to take a walk instead of using the car. This can improve health and the environment, but it will not contribute to economic growth: healthy people who don't use cars are not good friends of GDP-based economic growth.
English is not my mother language. Ursula Le Guin had an eloquent post about this on her blog, but it was removed to appear in her last book. The post is called "Clinging to a Metaphor" (the metaphor is economic growth) and the book is called "No Time to Spare".
[1] https://youtu.be/kz9wjJjmkmc
"The Limits to Growth" (1972) https://en.wikipedia.org/wiki/The_Limits_to_Growth
"Thinking in Systems: a Primer" (2008) https://g.co/kgs/B71ebC
Glossary of systems theory https://en.wikipedia.org/wiki/Glossary_of_systems_theory
Systems Theory https://en.wikipedia.org/wiki/Systems_theory
...
Computational Thinking https://en.wikipedia.org/wiki/Computational_thinking
Which of the #GlobalGoals (UN Sustainable Development Goals) Targets and Indicators are primary leverage points for ensuring - if not growth - prosperity? https://en.wikipedia.org/wiki/Sustainable_Development_Goals
SQLite Release 3.25.0 adds support for window functions
https://www.windowfunctions.com is a good introduction to window functions.
Besides that, the comprehensive testing and evaluation of SQLite never ceases to amaze me. I'm usually hesitant to call software development "engineering", but SQLite is definitely well-engineered.
This looks great, but I couldn't get through the first question on aggregate functions. Are there any SQL books/tutorials that go over things like this?
A lot of material I've seen has been like the classic image of "How to draw an owl. First draw two circles, then draw the rest of the owl", where they tell you the super basic stuff, then assume you know everything.
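As a self-contained first step, here's a window function runnable from Python (requires SQLite 3.25+; check sqlite3.sqlite_version). Unlike GROUP BY, the OVER clause keeps every row while aggregating across its partition:

```python
import sqlite3

conn = sqlite3.connect(":memory:")
conn.executescript("""
    CREATE TABLE sales (region TEXT, amount INTEGER);
    INSERT INTO sales VALUES ('east', 10), ('east', 20), ('west', 5);
""")
# Running total within each region; every input row survives.
rows = conn.execute("""
    SELECT region, amount,
           SUM(amount) OVER (PARTITION BY region ORDER BY amount) AS running
    FROM sales
    ORDER BY region, amount
""").fetchall()
print(rows)  # [('east', 10, 10), ('east', 20, 30), ('west', 5, 5)]
```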
Ibis uses windowing functions for aggregations if the database supports them. IDK when support for the new SQLite support will be implemented? http://docs.ibis-project.org/sql.html#window-functions
[EDIT]
I created an issue for this here: https://github.com/ibis-project/ibis/issues/1597
Update on the Distrust of Symantec TLS Certificates
Is the certifi bundle (2018.8.13) on PyPI also updated? https://pypi.org/project/certifi/
https://github.com/certifi/certifi.io/issues/18
> Are these still in the bundle?
> Should projects like requests which depend on certifi also implement this logic?
The Transport Layer Security (TLS) Protocol Version 1.3
Academic Torrents – Making 27TB of research data available
All the .torrent files are served over HTTP, so with a simple MITM attack a bad actor could swap in their own custom-tweaked version of any data set here to achieve whatever goals might serve the bad actor's interests.
I really wish we could get basic security concepts added to the default curriculum for grade schoolers. You shouldn't need a PhD in computer security to know this stuff. These site creators have PhDs in other fields, but obviously no concept of security. This stuff should be basic literacy for everyone.
> This stuff should be basic literacy for everyone.
Arguably, one compromised PKI X.509 CA jeopardizes all SSL/TLS channel security if there's no certificate pinning and no alternate channel for distributing signed cert fingerprints (cryptographically signed hashes).
We could teach blockchain and cryptocurrency principles: private/secret key, public key, hash verification; there, there's money on the table.
GPG presumes secure key distribution (`gpg --verify .asc`).
TUF is designed to survive certain role key compromises. https://theupdateframework.github.io
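The hash-verification piece of that is small; a sketch in stdlib Python (the expected fingerprint must itself arrive over a trusted channel, or it proves nothing):

```python
import hashlib

def verify(path, expected_sha256):
    """Return True iff the file at path hashes to the expected digest."""
    h = hashlib.sha256()
    with open(path, "rb") as f:
        # Read in chunks so large data sets don't need to fit in memory.
        for chunk in iter(lambda: f.read(1 << 16), b""):
            h.update(chunk)
    return h.hexdigest() == expected_sha256
```

A .torrent fetched over HTTPS, or a pinned info-hash, would play the role of the trusted fingerprint here.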
1/0 = 0
1/0 = 1(±∞)
https://twitter.com/westurner/status/960508624849244160
> How many times does zero go into any number? Infinity. [...]
> How many times does zero go into zero? infinity^2?
Power Worth Less Than Zero Spreads as Green Energy Floods the Grid
I don't truly understand this "problem". I understand storing the energy in batteries is currently very expensive economically and materially.
However, I believe there are plenty of "goods" (irrespective of whether they are bulk materials or partially processed products) which have a high processing-energy-per-volume ratio (this does not need to be recoverable stored energy).
Allow me to give an example: currently we have a drought in Belgium (or at least Flanders). We are not landlocked; there is plenty of water in the sea. Desalination is energy intensive. Instead of only looking at energy storage, why can't we increase the processing capacity (more desalination sites capable of working in parallel) and desalinate sea water during the energy flood? I don't expect this to be an ideal real-world example, only a pattern for identifying such examples: any product (composite parts, or bulk material) which is relatively compact and has some high-energy-per-product-volume processing step. Just do the process (desalination, welding some part to another part...) when the sun shines, and store the products for later.
Products with very high step energy density are good candidates for storing, and could help flatten daily variations, and perhaps even seasonal variations!
Now some companies would prefer avoiding risk if they don't have guaranteed orders far enough into the future, then perhaps there should be a market for insurance or loans, so that the company is encouraged to take the risk, instead of wasting the cheap energy...
Capital costs vs. marginal costs.
You're going to build a $100M desalination plant and run it for three hours a day? That's a ton of money sitting idle most of the day, far more than what the free energy during those hours recovers.
(This is called the utilization factor -- how long a piece of equipment is used vs. staying idle)
Ideally you want useful processes with low capital costs and expensive marginal/energy costs. Desalination is not one of those.
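The parent's point falls out of a back-of-envelope calculation (all numbers hypothetical):

```python
def cost_per_m3(capital, lifetime_years, hours_per_day, energy_cost_per_hour,
                output_m3_per_hour):
    """Amortized cost per cubic meter: capital spread over lifetime output,
    plus the per-hour energy bill."""
    hours = lifetime_years * 365 * hours_per_day
    return (capital + hours * energy_cost_per_hour) / (hours * output_m3_per_hour)

# Hypothetical $100M plant, 25-year life, 1000 m^3/hour.
around_the_clock = cost_per_m3(100e6, 25, 24, 500, 1000)  # paid-for energy
solar_hours_only = cost_per_m3(100e6, 25, 3, 0, 1000)     # free energy, 3 h/day
print(around_the_clock, solar_hours_only)
```

Even with free energy, running only three hours a day roughly quadruples the per-unit cost here, because the idle capital dominates.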
A desal plant is a useful thing and can be run around the clock off traditional energy sources. The "free energy" hours would help lower the costs.
I can even imagine a SETI-like application where people who over-generate power are able to donate it to causes of their preference...
But if you are using traditional energy sources to run it all the time, it is no longer 'solving' the problem of burning off excess energy during peak renewable times.
Someone is having to build a lot of highly wasteful, redundant infrastructure.
Rational cryptocurrency mining firms can use the excess (unstorable) energy by converting it back to money (while the sun shines and the wind blows).
Money > Energy > Money
> Someone is having to build a lot of highly wasteful, redundant infrastructure.
We're nowhere near having the energy infrastructure necessary to support everyone having an electric vehicle yet.
Energy storage is key to maximizing returns from renewables and minimizing irreversible environmental damage.
Kernels, a free hosted Jupyter notebook environment with GPUs
Hey Ben, are these going to support arbitrary CUDA?
At the moment, we're focused on providing great support for the Python and R analytics/machine learning ecosystems. We'll likely expand this in the future, and in the meantime it's possible to hack through many other use cases we don't formally support well.
How do you handle custom environment requirements, whether it’s Python version, library version, or more complex things in the environment that some code might run on?
Basically, suppose I wanted everything that I could define in a Docker container to be available “as the environment” in which the notebook is running. How do I do that?
I ask because I’ve started to see an alarming proliferation of “notebook as a service” platforms that don’t offer that type of full environment spec, if they offer any configuration of the run time environment at all.
I’ve taught probability and data science at university level and worked in machine learning in a variety of businesses too, and I’d say for literally all use cases, from the quickest little pure-pedagogy prototype of a canned Keras model to a heavily customized use case with custom-compiled TensorFlow, different data assets for testing vs. ad hoc exploration vs. deployment, etc., the absolute minimum thing needed before anything can be said to offer “reproducibility” is complete specification of the runtime environment and artifacts.
The trend of convincing people that a little “poke around with scripts in a managed environment” offering is value-additive is dangerous. It’s very similar to MATLAB’s approach of entwining all data exploration with the atrocious development habits that the console environment facilitates (and of specifically targeting university students with free licenses, a drug-dealer model to get engineers hooked on MATLAB’s workflow and then leverage employers into buying and standardizing on abjectly bad MATLAB products).
Any time I meet young data scientists I always try to encourage them to avoid junk like that. It’s vital to begin experiments with fully reproducible artifacts like thick archive files or containers, and to structure code into meaningful reproducible units even for your first ad hoc explorations, and to absolutely always avoid linear scripting as an exploratory technique (it is terrible and ineffective for such a task).
Kaggle Kernels seems like a cool idea, so long as the programmer must fully define artifacts that describe the complete entirety of the run time environment, and nobody is sold on the Kool Aid of just linear scripting in some other managed environment.
Each kernel for example could have a link back to a GitHub repo containing a Dockerfile and build scripts for what defined the precise environment the notebook is running in. Now that’s reproducible.
Here are the Kaggle Kernels Dockerfiles:
- Python: https://github.com/Kaggle/docker-python/blob/master/Dockerfi...
- R: https://github.com/Kaggle/docker-rstats/blob/master/Dockerfi...
https://mybinder.org builds containers (and launches free cloud instances) on demand with repo2docker from a (commit hash, branch, or tag) repo URL: https://repo2docker.readthedocs.io/en/latest/config_files.ht...
That’s a great first step! Adding the ability to customize on a per-notebook basis would be impressive.
Solar and wind are coming. And the power sector isn’t ready
I don't know that fatalism and hopelessness are motivating for decision makers (who are seeking greater margins regardless of policy and lobbies).
Is our transformation to 100% clean energy ASAP a certain eventuality? On a long enough timescale, it would be irrational for utilities to not choose both lower cost and more sustainable environmental impact ('price-rational', 'environment-rational').
We should expect storage and generation costs to continue to fall as we realize even just the current pipeline of capitalizable [storage] research.
Solar energy is free.
Solar Just Hit a Record Low Price in the U.S
Relevant bits:
> “On their face, they’re less than a third the price of building a new coal or natural gas power plant,” Ramez Naam, an energy expert and lecturer at Singularity University, told Earther in an email. “In fact, building these plants is cheaper than just operating an existing coal or natural gas plant.”
> There’s a 30 percent federal investment tax credit for solar projects that helps drive down the cost of this and other solar projects. But Naam said even if you take away that credit, “these bids, un-subsidized, are still cheaper than any new coal or gas plants, and possibly cheaper than operating existing plants.”
(emphasis mine)
I'm assuming that's without factoring in the health cost externalities.
Yes, this is solely for cost of power. The healthcare savings and quality of life improvements are an additional bonus on top of very cheap power.
https://www.sciencenews.org/article/air-pollution-triggering... (Air pollution is triggering diabetes in 3.2 million people each year)
https://www.scientificamerican.com/article/the-other-reason-... (The Other Reason to Shift away from Coal: Air Pollution That Kills Thousands Every Year)
Causal Inference Book
Causal inference (Causal reasoning) https://en.wikipedia.org/wiki/Causal_inference ( https://en.wikipedia.org/wiki/Causal_reasoning )
Tim Berners-Lee is working on a platform designed to re-decentralize the web
Does anyone have some links on Solid that aren't media articles? I can't find anything, not even in Tim's homepage.
Home page: https://solid.mit.edu/
Spec: https://github.com/solid/solid-spec
Source: https://github.com/solid/solid
...
From https://news.ycombinator.com/item?id=16615679 ( https://westurner.github.io/hnlog/#comment-16615679 )
> ActivityPub (and OStatus, and ActivityStreams/Salmon, and OpenSocial) are all great specs and great ideas. Hosting and moderation cost real money (which spammers/scammers are wasting).
> Know what's also great? Learning. For learning, we have the xAPI/TinCan spec and also schema.org/Action.
Mastodon has now supplanted GNU StatusNet.
More States Opting to 'Robo-Grade' Student Essays by Computer
edX can automate short essay grading with edx/edx-ora2 "Open Response Assessment Suite" [1] and edx/ease "Enhanced AI scoring engine" [2].
1: https://github.com/edx/edx-ora2 2: https://github.com/edx/ease
... I believe there's also a tool for peer feedback.
Peer feedback/grading on MOOCs is pretty bad in my experience. There’s too much diversity of skills, language ability, etc. And too many people who bring their own biases and mostly ignore any grading instructions.
Peer discussion and feedback are useful in things like college classes. Much less so with MOOCs.
Ask HN: Looking for a simple solution for building an online course
I want to build an online course on graph algorithms for my university. I've tried to find a solution which would let students submit, execute, and test their code (i.e., an online judge), but have had no success. There are a lot of complex LMSes, and none of them seem to have this as basic functionality.
Are there any good out-of-box solutions? I'm sure I can build a course using Moodle or another popular LMS with some plugin, but I don't want to spend my time customizing things.
I'm interested both in platforms and self-hosted solutions. Thanks!
Maybe look at Jupyter Notebook? It does much of this out of the box, but may not be exactly what you are looking for.
nbgrader is a "A system for assigning and grading Jupyter notebooks." https://github.com/jupyter/nbgrader
jupyter-edx-grader-xblock https://github.com/ibleducation/jupyter-edx-grader-xblock
> Auto-grade a student assignment created as a Jupyter notebook, using the nbgrader Jupyter extension, and write the score in the Open edX gradebook
... networkx is a graph library written in Python which has pretty good docs: https://networkx.github.io/documentation/stable/reference/
There are a few books which feature networkx.
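As a sketch of the kind of exercise such a judge could auto-check (a hypothetical assignment; stdlib only, though networkx's shortest_path would verify the same answer):

```python
import heapq

def dijkstra(graph, start):
    """Shortest distances from start; graph is {node: [(neighbor, weight), ...]}."""
    dist = {start: 0}
    heap = [(0, start)]
    while heap:
        d, u = heapq.heappop(heap)
        if d > dist.get(u, float("inf")):
            continue  # stale heap entry
        for v, w in graph.get(u, []):
            nd = d + w
            if nd < dist.get(v, float("inf")):
                dist[v] = nd
                heapq.heappush(heap, (nd, v))
    return dist

g = {"a": [("b", 1), ("c", 5)], "b": [("c", 2)]}
print(dijkstra(g, "a"))  # {'a': 0, 'b': 1, 'c': 3}
```

A grader (nbgrader cell test or judge harness) would just assert the returned distances against known answers.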
There is now a backprop principle for deep learning on quantum computers
"A Universal Training Algorithm for Quantum Deep Learning" https://www.arxiv-vanity.com/papers/1806.09729/
New research a ‘breakthrough for large-scale discrete optimization’
Looks like it may be this paper:
"An Exponential Speedup in Parallel Running Time for Submodular Maximization without Loss in Approximation" https://www.arxiv-vanity.com/papers/1804.06355/
The ACM STOC 2018 conference links to "The Adaptive Complexity of Maximizing a Submodular Function" http://dl.acm.org/authorize?N651970 https://scholar.harvard.edu/files/ericbalkanski/files/the-ad...
A DOI URI would be great, thanks.
Wind, solar farms produce 10% of US power in the first four months of 2018
Take note: 10% PRODUCED, not 10% CONSUMED.
This is counting all output by wind and solar, regardless of whether it is needed and usable when the power is being produced. This is quite important because wind and solar are not on-demand sources of power.
> This is counting all output by wind and solar, regardless of whether it is needed and usable when the power is being produced. This is quite important because wind and solar are not on-demand sources of power.
I think you have that backwards: in the US, we lack the ability to scale down coal and nuclear plants. Solar and Wind are generally the first to get pulled offline when generated capacity exceeds demand and storage.
TIL this is called "curtailment" and it's an argument that utilities have used to justify not spending on renewables that are saving the environment from global warming (which is going to require more electricity for air conditioning).
Solar energy production peaks around noon. Demand for electricity peaks in the evening. We need storage (batteries with supercapacitors out front) in order to store the difference between peak generation and peak use. Because they're unable to store this extra energy, they temporarily shut down solar and wind and leave the polluting plants online.
Consumers aren't exposed to daily price fluctuations: they get a flat rate that makes it easy to check their bill; so there's no price incentive to e.g. charge an EV at midday when energy is cheapest.
The 'Duck curve' shows this relation between peak supply and demand in electricity markets: https://en.wikipedia.org/wiki/Duck_curve
Developing energy storage capabilities (through infrastructure and open access basic research that can be capitalized by all) is likely the best solution. According to a fairly recent report, we could go 100% renewable with the energy storage tech that exists today.
But there's no money for it. There's money for subsidizing oil production (regardless of harms (!)), but not so much for wind and solar. There's money for responding to natural disasters caused by global warming, but not so much for non-carbon-based energy sources that don't cause global warming. A film called "The Burden: Fossil Fuel, the Military, and National Security" quotes the actual unsubsidized price of a gallon of gasoline.
Wouldn't it be great if there were some kind of computer workload that could be run whenever energy is cheapest ('energy spot instances'), so that we could accelerate our migration to renewable energy sources that are saving the environment for future generations? If only there were people with strong incentives to create demand for power-efficient chips and inexpensive clean energy.
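A minimal sketch of the 'energy spot instances' idea: defer deferrable jobs into the hours where a (hypothetical) spot price feed reports cheap power. The prices, threshold, and job names below are invented for illustration:

```python
def schedule_when_cheap(jobs, hourly_prices, threshold):
    """Greedy sketch: assign queued jobs to the hours where the (hypothetical)
    spot price is at or below `threshold`. `hourly_prices` maps hour -> $/kWh."""
    cheap_hours = sorted(h for h, p in hourly_prices.items() if p <= threshold)
    return dict(zip(cheap_hours, jobs))

prices = {h: 0.12 for h in range(24)}
prices.update({11: 0.03, 12: 0.02, 13: 0.03})  # midday solar glut
plan = schedule_when_cheap(["backup", "transcode", "ml-train"], prices, 0.05)
print(plan)  # jobs land in the cheap midday hours
```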
Where would we be if we had continued with Jimmy Carter's solar panels on the roof of the White House (instead of constant war and meddling in competing oil-producing regions of the world)?
It's good to see wind and solar growing this fast this year. A chart with cost per kWh or MWh would be enlightening.
FDA approves first marijuana-derived drug and it may spark DEA rescheduling
Perhaps someone from the US could help me understand the federal vs state legality of cannabis in the USA?
Can a state override any federal law?
Could federal-level law enforcement theoretically charge someone in a cannabis-legal state for drug offenses?
states cannot override any federal law.
the federal government merely tolerates the semi-autonomous nature of states in order to maintain order. but the legality of the federal government's supremacy is well established, its power is extremely disproportionate to that of any collective of states, and at this point the semi-autonomous nature of states is pure fiction, as the federal government can assume jurisdiction over any intra-state matter if it wanted to through its interstate commerce powers.
yes, federal level law enforcement can theoretically charge someone in a cannabis-legal state for drug offenses. this still happens. the discretion of the words of the President, the heads of the DEA and DOJ prevent the MAJORITY of it from happening and also guide the discretion of the courts and public sympathy. so right now, for the last two administrations it has not been a priority to upset the social order in cannabis-legal states. but it can still happen.
> states cannot override any federal law.
Not actually true: https://en.wikipedia.org/wiki/Nullification_(U.S._Constituti...
> not actually true
a concept which requires agreement from the federal courts themselves, who have never upheld a single argument related to this concept.
not actually what?
Selective incorporation: https://en.wikipedia.org/wiki/Incorporation_of_the_Bill_of_R...
10th Amendment: https://en.wikipedia.org/wiki/Tenth_Amendment_to_the_United_...
> The Tenth Amendment, which makes explicit the idea that the federal government is limited to only the powers granted in the Constitution, has been declared to be a truism by the Supreme Court.
Supremacy clause: https://en.wikipedia.org/wiki/Supremacy_Clause
> The Supremacy Clause of the United States Constitution (Article VI, Clause 2) establishes that the Constitution, federal laws made pursuant to it, and treaties made under its authority, constitute the supreme law of the land.[1] It provides that state courts are bound by the supreme law; in case of conflict between federal and state law, the federal law must be applied. Even state constitutions are subordinate to federal law.[2] In essence, it is a conflict-of-laws rule specifying that certain federal acts take priority over any state acts that conflict with federal law
Natural rights ('inalienable rights': Equal rights, Life, Liberty, pursuit of Happiness): https://en.wikipedia.org/wiki/Natural_and_legal_rights
9th Amendment: https://en.wikipedia.org/wiki/Ninth_Amendment_to_the_United_...
> The Ninth Amendment (Amendment IX) to the United States Constitution addresses rights, retained by the people, that are not specifically enumerated in the Constitution. It is part of the Bill of Rights.
If the 9th Amendment recognizes any unenumerated rights of the people (with Supremacy, regardless of selective incorporation), it certainly recognizes those of the Declaration of Independence (secession from the king ('CSA'), Equality, Life, Liberty, pursuit of Happiness), our non-binding charter which frames the entirety of the Constitutional convention
All of these things have been interpreted by the courts and their conclusion was not like yours
It is nice that you are interested in these things, but they simply cannot be read verbatim and then extrapolated to other things.
This isn't educational for anybody, this is a view that lacks all consensus and all avenues to ever garner consensus in this country.
Again, I ask you to explain how the current law grants equal rights.
https://news.ycombinator.com/item?id=17401906
> We tend to have issues with Equal rights/protections: slavery, voting rights, [school] segregation. Please help us understand how to do this Equally:
>> Furthermore, (1) write a function to determine whether a given Person has a (natural inalienable) right: what information may you require? (2) write a function to determine whether any two Persons have equal rights.
Abolitionists faced similar criticism from on high.
States Can Require Internet Tax Collection, Supreme Court Rules
I have a hunch that this will, in the end, be a massive win for large retailers vs. small ones. The task of figuring out how to calculate tax for all states is more or less the same amount of work regardless of size, which means for someone like Amazon it's more or less trivial, but for a mom-and-pop store it's a major hassle.
Thankfully with Shopify it is extremely easy and straightforward to manage for my wife's small online store. Their platform does a great job properly charging taxes by state, county and city in certain situations. Then using an inexpensive plan from https://www.taxjar.com/ the entire filing and paying process is 100% automated.
In 10 minutes I was able to file and pay all the sales taxes to several states, dozens of California counties, and a handful of cities that charge additional taxes on top.
So you assume. As a small business you are unlikely to be audited, but that software could easily be wrong creating a huge minefield and potential liability.
So that would be the software company's liability. And such a business practice can differentiate the good ones from the bad ones. I even see a new business market here.
https://en.wikipedia.org/wiki/Parable_of_the_broken_window
There is zero economic gain from more complex tax rules. Further, the software does not absolve you of liability. At best they may agree to cover it, but that's unlikely and they can also go broke if they get it wrong.
Actually, current sales tax software provided by South Dakota and other states does absolve a merchant of liability if used to calculate sales tax due.
I agree with this approach!
Actually, the federal government should oblige each member state to provide the algorithm, and sign it cryptographically and have it expire every X fixed time interval, and have signed algorithms for the current and next time interval, so that software can automatically fetch and stay up to date.
Then the "business opportunity" of navigating FUD evaporates. Currently any such enterprise charging for such a service can spend a fraction of their budget lobbying against harmonization...
Since it would be an obligation of the states to the federal government, these algorithms (provided by each member state) should be hosted on a fixed federal government site.
Time to start a petition?
This would reduce costs of tax collection for all parties.
What is the most convenient format for this layered geographic data? Are the tax district boundary polygons already otherwise available as open data? What do localities call these? Sales tax tables, sales tax database, machine-readable flat files in an open format with a common schema?
How much tax revenue should it cost to provide such a service on a national level?
States, Counties, Cities, 'Tax Zones'(?) could be required to host tax.state.us.gov or similar with something like Project Open Data JSONLD /data.json that could be aggregated and shared by a server with a URL registry, a task queue service, and a CDN service.
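If the states published machine-readable rate feeds like that, the actual computation could reduce to a lookup and a sum. A minimal sketch, assuming a hypothetical {jurisdiction: rate} mapping that such a signed /data.json feed might provide (the rates below are made up, not real California rates):

```python
def sales_tax(amount, jurisdiction_rates):
    """Sum the rate of every jurisdiction overlapping the buyer's address
    (state, county, city, special district). `jurisdiction_rates` is a
    hypothetical {name: rate} mapping a signed, machine-readable feed
    could provide after geocoding the address."""
    total_rate = sum(jurisdiction_rates.values())
    return round(amount * total_rate, 2)

# Illustrative only: these are NOT real rates.
rates = {"CA": 0.06, "CA/Alameda": 0.0025, "CA/Alameda/Oakland": 0.0325}
print(sales_tax(100.00, rates))
```

The hard part, of course, is the geocoding step (mapping an address to its set of tax districts), which is exactly why the boundary polygons need to be open data too.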
While the Bitcoin tax payments bill passed the Senate and House in Arizona, it was vetoed in May 2018. Seminole County in Florida now allows tax payment with cryptocurrencies such as Bitcoin:
https://cointelegraph.com/news/us-seminole-county-florida-to...
> According to a press release, the county will begin accepting Bitcoin (BTC) and Bitcoin Cash (BCH) to pay for services, including property taxes, driver license and ID card fees, as well as tags and titles. The Seminole County Tax Collector will reportedly employ blockchain payments company BitPay, which will allow the county to receive settlement the next business day directly to its bank account in US dollars.
This could also help reduce the costs of tax collection and possibly increase the likelihood of compliance with the forthcoming tax bills!
These are all very good questions, and only a community discussion among people with the right skills and interests can draft a petition. If enough people contribute to the discussion, we can make the proposal more reasonable and robust against valid criticisms. I believe we can make this happen just by starting the discussion: we can complain on Hacker News, or we can draft a proposal for the different levels of government. The more reasonable the draft, the higher the probability the petition will succeed. I think it would be hard to argue against the idea that a legally enforced computation should be open source, i.e. not just the algorithm but also all the data lists and boundary polygons it uses.
Ask HN: Do you consider yourself to be a good programmer?
if not, why? how do you validate your achievements?
> For identifying strengths and weaknesses: "Programmer Competency Matrix":
> - http://sijinjoseph.com/programmer-competency-matrix/
> - https://competency-checklist.appspot.com/
> - https://github.com/hltbra/programmer-competency-checklist
Don’t read too much into that. TDD for example is not leveling up it’s an opinionated approach to development.
Automated testing is not a choice in many industries.
If you're not familiar with TDD, you haven't yet achieved that level of mastery.
There's a productivity boost to being able to change quickly without breaking things.
Is all unit/functional/integration testing and continuous integration TDD? Is it still TDD if you write the tests after you write the function (and before you commit/merge)?
I think this competency matrix is a helpful resource. And I think that learning TDD is an important thing for a good programmer.
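For anyone unfamiliar, the TDD loop is just: write a failing test first, then write the minimum code to make it pass, then refactor. A tiny self-contained sketch (the slugify example is mine, not from the matrix):

```python
# Red: the test exists before the implementation (it would fail with a
# NameError if run at this point).
def test_slug():
    assert slugify("Hello, World!") == "hello-world"
    assert slugify("  spaces  ") == "spaces"

# Green: write just enough code to make the test pass.
import re

def slugify(text):
    """Lowercase, strip non-alphanumerics, join words with hyphens."""
    words = re.findall(r"[a-z0-9]+", text.lower())
    return "-".join(words)

test_slug()  # passes silently
```

Whether you count writing the tests immediately *after* the function as TDD is, per the thread above, mostly a definitional argument.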
There is absolutely no need to follow TDD to be good at testing.
This is all unfounded conjecture: it seems easier to remember which parameter combinations may exist and need to be tested when writing the function; so "let's all write tests later" becomes a black box exercise which is indeed a helpful perspective for review, but isn't the most effective use of resources.
IMHO being convinced that there's only one true and correct methodology (TDD, Scrum, etc.) or paradigm (functional, objective, reactive programming, etc.) is a sign of being a bad programmer.
A good programmer finds common attributes and behaviors and organizes them into namespaced structs/arrays/objects with functions/methods and tests. Abstractly, which terms should we use to describe hierarchical clusters of things with information and behaviors if not those from a known software development or project management methodology?
And a good programmer asks why people might have spent so much time formalizing project development methodologies. "What sorts of product (team) failures are we dealing with here?" is an expensive question to answer as a team.
By applying tenets of Named agile software development methodologies, teams and managers can feel like they're discussing past and current experiences/successes/failures with comparable implementations of approaches that were or are appropriate for different contexts.
To argue the other side, just cherry picking from different methodologies is creating a new methodology, which requires time to justify basically what we already have terms for on the wall over here.
"We just pop tasks off the queue however" is really convenient for devs but can be kept cohesive by defining sensible queues: [kanban] board columns can indicate task/issue/card states and primacy, [sprint] milestone planning meetings can yield complexity 'points' estimates for completable tasks and their subtasks. With team velocity (points/time), a manager can try to appropriately schedule optimal paths of tasks (that meet the SMART criteria (specific, measurable, achievable, relevant, and Time-bound)); instead of fretting with the team over adjusting dates on a Gantt chart (task dependency graph) deadline, the team can
What about your testing approach makes it 'NOT TDD'?
How long should the pre-release static analysis and dynamic analyses take in my fancy DevOps CI TDD with optional CD? Can we release or deploy right now? Why or why not?
'We can't release today because we spent too much time arguing about quotes like "A foolish consistency is the hobgoblin of little minds, adored by little statesmen and philosophers and divines." ("Self Reliance" 1841. Emerson) and we didn't spec out the roof trusses ahead of time because we're continually developing a new meeting format, so we didn't get to that, or testing the new thing, yet.'
A good programmer can answer the three questions in a regular meeting at any time, really:
> 1. What have you completed since the last meeting?
> 2. What do you plan to complete by the next meeting?
> 3. What is getting in your way?
And:
Can we justify refactoring right now for greater efficiency or additional functionality?
The simple solution there is to simply not use specific parameters (outside obvious edge cases, e.g. supplying -1 or 2^63 to your memory allocator). Writing a simple reproducible fuzzer is easy for most contained functions.
I find blackbox testing itself also fairly useful. The part where you forget which parameter combinations may occur can be useful since you now A) rely on documentation you made and B) can write your test independent of how you implemented it just like if you had written it beforehand. (Just don't forget to avoid falling into the 'write test to pass function' trap)
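A sketch of the "simple reproducible fuzzer" idea from above: seed the RNG so any failing case can be replayed exactly. `clamp` here is a stand-in for a contained function with a small, known parameter space:

```python
import random

def clamp(x, lo, hi):
    """Contained function under test: restrict x to [lo, hi]."""
    return max(lo, min(x, hi))

def fuzz_clamp(seed=42, rounds=1000):
    """Reproducible fuzzer: a fixed seed makes every failure replayable."""
    rng = random.Random(seed)
    for _ in range(rounds):
        lo, hi = sorted(rng.uniform(-1e9, 1e9) for _ in range(2))
        x = rng.uniform(-2e9, 2e9)
        y = clamp(x, lo, hi)
        # The invariant we check, rather than specific parameter values:
        assert lo <= y <= hi, (seed, x, lo, hi, y)

fuzz_clamp()
```

Checking an invariant over random inputs sidesteps the "which parameter combinations did I forget?" problem entirely, at the cost of only testing properties you thought to state.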
IMHO, it's so much easier to write good, comprehensive tests while writing the function (FUT: function under test) because that information is already in working memory.
It's also easier to adversarially write tests with a fresh perspective.
I shouldn't need to fuzz every parameter for every commit. Certainly for releases.
"Building an AppSec Pipeline: Keeping your program, and your life, sane" https://www.owasp.org/index.php/OWASP_AppSec_Pipeline
I mean, in general I don't think you should write fuzzers for absolutely everything (most contained functions => don't touch a lot of other stuff and have few parameters with a known parameter space).
The general solution is to use whatever testing methodology you are comfortable with that is very effective, very efficient, and covers a lot of the problem space. Of course no testing method does all of that, so you'll have to constantly balance whatever works best (which is why I think pure TDD is overrated).
> Is all unit/functional/integration testing and continuous integrating TDD?
No. They differentiate in the matrix.
> If you're not familiar with TDD, you haven't yet achieved that level of mastery.
That's not true - I've worked on teams with far lower defect rates than the typical TDD team.
TDD can help keep a developer focused - and this can help overall productivity rates - but it doesn't directly help lower defect rates.
> TDD can help keep a developer focused - and this can help overall productivity rates - but it doesn't directly help lower defect rates.
We would need to reference some data with statistical power; though randomization and control are infeasible: no two teams are the same, no two projects are the same, no two objective evaluations of different apps' teams' defect rates are an apples to apples comparison.
Maybe it's the coverage expectation: do not add code that is not run by at least one test.
Handles are the better pointers
A few years ago I wanted to make a shoot-em-up game using C# and XNA I could play with my kids on an Xbox. It worked fine except for slight pauses for GC every once in a while which ruined the experience.
The best solution I found was to get rid of objects and allocation entirely and instead store the state of the game in per-type tables like in this article.
Then I realized that I didn't need per-type tables at all, and if I used what amounted to a union of all types I could store all the game state in a single table, excluding graphical data and textures.
Next I realized that since most objects had properties like position and velocity, I could update those with a single method by iterating through the table.
That led to each "object" being a collection of various properties acted on by independent methods which were things like gravity and collision detection. I could specify that a particular game entity was subject to gravity or not, etc.
The resulting performance was fantastic and I found I could have many thousands of on-screen objects at a time with no hiccups. The design also made it really easy to save and replay game state: just save or serialize the single state table and the input for each frame.
The final optimization would have been to write a language that would define game entities in terms of the game components they were subject to and automatically generate the single class that was union of all possible types and would be a "row" in the table. I didn't get around to that because it just wasn't necessary for the simple game I made.
I was inspired to do this from various articles about "component based game design" published at the time that were variations on this technique, and the common thread was that a hierarchical OO structure ended up adding a lot of unneeded complexity for games that hindered flexibility as requirements changed or different behaviors for in-game entities were added.
Edit: This is a good article on the approach. http://cowboyprogramming.com/2007/01/05/evolve-your-heirachy...
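A stripped-down sketch of the single-table component approach described above, in Python rather than C#/XNA (the names `world` and `physics_system` are mine): each "row" is a union of optional components, and a system touches only the rows that carry its components.

```python
# One table of rows; each row is a dict of optional components.
world = [
    {"pos": [0.0, 10.0], "vel": [1.0, 0.0], "gravity": True},   # falling entity
    {"pos": [5.0, 5.0],  "vel": [0.0, 0.0], "gravity": False},  # floating pickup
]

def physics_system(rows, dt, g=-9.8):
    """Single pass over the table: apply gravity where flagged, then integrate
    position for every row that has both a position and a velocity."""
    for row in rows:
        if row.get("gravity"):
            row["vel"][1] += g * dt
        if "pos" in row and "vel" in row:
            row["pos"][0] += row["vel"][0] * dt
            row["pos"][1] += row["vel"][1] * dt

physics_system(world, dt=0.1)
print(world[0]["pos"])  # entity 0 fell and drifted right; entity 1 is unchanged
```

Saving/replaying game state then really is just serializing `world` plus per-frame input, as the parent describes. A compiled language would use a flat struct array instead of dicts, which is where the "generate the union type" step comes in.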
> The final optimization would have been to write a language that would define game entities in terms of the game components they were subject to and automatically generate the single class that was union of all possible types and would be a "row" in the table
django-typed-models https://github.com/craigds/django-typed-models
> polymorphic django models using automatic type-field downcasting
> The actual type of each object is stored in the database, and when the object is retrieved it is automatically cast to the correct model class
...
> the common thread was that a hierarchical OO structure ended up adding a lot of unneeded complexity for games that hindered flexibility as requirements changed or different behaviors for in-game entities were added.
So, in order to draw a bounding box for an ensemble of hierarchically/tree/graph-linked objects (possibly modified in supersteps for reproducibility), is an array-based adjacency matrix still fastest?
Are sparse arrays any faster for this data architecture?
> django-typed-models https://github.com/craigds/django-typed-models
Interesting, I'd never come across it. But I've got a Django library (not yet open source, because it's in a project I've been working on, on and off) that does the same for managing GitHub accounts. An Organization and a User inherit the same properties, and I downcast them based on their polymorphic type field.
ContentType.model_class(), models.Model.meta.abstract=True, django-reversion, django-guardian
IDK how to do partial indexes with the Django ORM? A simple filter(bool, rows) could probably significantly shrink the indexes for such a wide table.
Arrays are fast if the features/dimensions are known at compile time (if the TBox/schema is static). There's probably an intersection between object reference overhead and array copy costs.
Arrow (with e.g. parquet on disk) can help minimize data serialization/deserialization costs and maximize copy-free data interoperability (with columnar arrays that may have different performance characteristics for whole-scene transformation operations than regular arrays).
Many implementations of SQL ALTER TABLE don't have to create a full copy in order to add a column, but do require a permission that probably shouldn't be GRANTed to the application user and so online schema changes are scheduled downtime operations.
If you're not discovering new features at runtime and your access pattern is generally linear, arrays probably are the fastest data structure.
Hacker News also has a type attribute that you might say is used polymorphically: https://github.com/HackerNews/API/blob/master/README.md#item...
Types in RDF are additive: a thing may have zero or more rdf:type property instances. RDF quads can be stored in one SQL table like:
_id,g,s,p,o,xsd:datatype,xml:lang
... with a few compound indexes that are combinations of (s,p,o) so that triple pattern graph queries like (?s,?p,1) are fast. Partial indexes (SQLite, PostgreSQL,) would be faster than full-table indexes for RDF in SQL, too.
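A minimal sketch of that quad table in SQLite via Python's sqlite3 (the `xsd:datatype` and `xml:lang` columns are renamed to be SQL-friendly); SQLite has supported partial indexes since 3.8.0:

```python
import sqlite3

con = sqlite3.connect(":memory:")
con.executescript("""
CREATE TABLE quads (
  _id INTEGER PRIMARY KEY, g TEXT, s TEXT, p TEXT, o TEXT,
  datatype TEXT, lang TEXT
);
-- compound indexes so the common triple patterns are index scans
CREATE INDEX quads_spo ON quads (s, p, o);
CREATE INDEX quads_pos ON quads (p, o, s);
-- partial index: only rows that actually carry a language tag
CREATE INDEX quads_lang ON quads (o) WHERE lang IS NOT NULL;
""")
con.execute("INSERT INTO quads (g,s,p,o,datatype,lang) VALUES (?,?,?,?,?,?)",
            ("graph1", ":alice", "rdf:type", ":Person", None, None))
con.execute("INSERT INTO quads (g,s,p,o,datatype,lang) VALUES (?,?,?,?,?,?)",
            ("graph1", ":alice", "rdfs:label", "Alice", None, "en"))

# triple pattern (?s, rdf:type, :Person) served by the quads_pos index
rows = con.execute(
    "SELECT s FROM quads WHERE p = ? AND o = ?", ("rdf:type", ":Person")
).fetchall()
print(rows)
```

The additive-types point falls out naturally: a second `rdf:type` row for `:alice` is just another insert, no schema change.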
Neural scene representation and rendering
This work is a natural progression from a lot of other prior work in the literature... but that doesn't make the results any less impressive. The examples shown are amazingly, unbelievably good! Really GREAT WORK.
Based on a quick skim of the paper, here is my oversimplified description of how this works:
During training, an agent navigates an artificial 3D scene, observing multiple 2D snapshots of the scene, each snapshot from a different vantage point. The agent passes these snapshots to a deep net composed of two main parts: a representation-learning net and a scene-generation net. The representation-learning net takes as input the agent's observations and produces a scene representation (i.e., a lower-dimensional embedding which encodes information about the underlying scene). The scene-generation network then predicts the scene from three inputs: (1) an arbitrary query viewpoint, (2) the scene representation, and (3) stochastic latent variables. The two networks are trained jointly, end-to-end, to maximize the likelihood of generating the ground-truth image that would be observed from the query viewpoint. See Figure 1 on Page 15 of the Open Access version of the paper. Obviously I'm playing loose with language and leaving out numerous important details, but this is essentially how training works, as I understand it based on a first skim.
EDIT: I replaced "somewhat obvious" with "natural," which better conveys what I actually meant to write the first time around.
I, literally just 15 minutes ago, had a chat with a friend of mine exactly about how what we are doing right now with computer vision is all based on a flawed premise (supervised 2D training set). The human brain works in 3D space (or 3D+time) and then projects all this knowledge in a 2D image.
Here I was, thinking I finally had thought of a nice PhD project and then Deepmind comes along and gets the scoop! Haha.
"Spatial memory" https://en.wikipedia.org/wiki/Spatial_memory
It may be splitting hairs, but I think the mammalian brain, at least, can simulate/remember/imagine additional 'dimensions' like X/Y/Z spin, derivatives of velocity like acceleration/jerk/jounce.
Is space 11 dimensional (M string theory) or 2 dimensional (holographic principle)? What 'dimensions' does the human brain process? Is this capacity innate or learned; should we expect pilots and astronauts to have learned to more intuitively cognitively simulate gravity with their minds?
New US Solar Record – 2.155 Cents per KWh
"Cost of electricity by source" https://en.wikipedia.org/wiki/Cost_of_electricity_by_source
"Electricity pricing" https://en.wikipedia.org/wiki/Electricity_pricing
> United States: 8 to 17; 37[c]; 43[c] (cents USD/kWh)
Ask HN: Is there a taxonomy of machine learning types?
Besides classification and regression, and the unsupervised methods for principal components, clustering, and frequent item-sets, what tools are there in the ML toolkit and what kinds of problems are amenable to their use?
Outline of Machine Learning https://en.wikipedia.org/wiki/Outline_of_machine_learning
Machine learning # Applications https://en.wikipedia.org/wiki/Machine_learning#Applications
"machine learning map" image search: https://www.google.com/search?q=machine+learning+map&tbm=isc...
Senator requests better https compliance at US Department of Defense [pdf]
The "Mozilla SSL Configuration Generator" has a checkbox for 'HSTS enabled?' and can generate SSL/TLS configs for Apache, Nginx, Lighttpd, HAProxy, AWS, ELB. https://mozilla.github.io/server-side-tls/ssl-config-generat...
You can select 'nginx', then 'modern', and then 'apache' for a modern Apache configuration.
Are the 'modern' configs FIPS compliant?
What browsers/tools does requiring TLS 1.3 break?
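As a sketch of auditing one piece of this (HSTS), here's a check of a response's Strict-Transport-Security header against a six-month max-age threshold; the threshold and the header dict are my own illustrative choices, not anything from the senator's letter:

```python
def check_hsts(headers, min_age=15768000):
    """Sketch: does a response's Strict-Transport-Security header meet a
    minimal policy (max-age >= ~6 months)? `headers` is a plain dict of
    response headers; returns False when the header is missing or too short."""
    value = headers.get("Strict-Transport-Security", "")
    directives = [d.strip().lower() for d in value.split(";") if d.strip()]
    max_age = 0
    for d in directives:
        if d.startswith("max-age="):
            max_age = int(d.split("=", 1)[1])
    return max_age >= min_age

print(check_hsts({"Strict-Transport-Security":
                  "max-age=63072000; includeSubDomains"}))
print(check_hsts({}))  # no header at all fails the check
```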
Banks Adopt Military-Style Tactics to Fight Cybercrime
> In a windowless bunker here, a wall of monitors tracked incoming attacks — 267,322 in the last 24 hours, according to one hovering dial, or about three every second — as a dozen analysts stared at screens filled with snippets of computer code.
> Cybercrime is one of the world’s fastest-growing and most lucrative industries. At least $445 billion was lost last year, up around 30 percent from just three years earlier, a global economic study found, and the Treasury Department recently designated cyberattacks as one of the greatest risks to the American financial sector.
Is this type of monitoring possible (necessary, even) with blockchains? Blockchains generally silently disregard bad/invalid transactions. Where could discarded/disregarded transactions and forks be reported to in a decentralized blockchain system? Who would pay for log storage? How redundantly replicated should which data be?
How DDOS resistant are centralized and decentralized blockchains?
Exchanges have risk. In terms of credit fraud: some crypto asset exchanges do allow margin trading, many credit card companies either refuse transactions with known exchanges or charge cash advance interest rates, and all transactions are final.
Exchanges hold private keys for customers' accounts, move a lot to offline cold storage, and maybe don't do a great job of explaining that YOU SHOULD NOT LEAVE MONEY ON AN EXCHANGE. One should transfer funds to a different account; such as a hardware or paper wallet or a custody service.
Do/can crypto asset exchanges participate in these exercises? To what extent do/can blockchains help solve for aspects of our unfortunately growing cybercrime losses?
Premined blockchains could reportedly handle card/chip/PIN transaction volumes today.
No, Section 230 Does Not Require Platforms to Be “Neutral”
> It’s foolish to suggest that web platforms should lose their Section 230 protections for failing to align their moderation policies to an imaginary standard of political neutrality. Trying to legislate such a “neutrality” requirement for online platforms—besides being unworkable—would be unconstitutional under the First Amendment.
... https://en.wikipedia.org/wiki/Section_230_of_the_Communicati...
Ask HN: Do battery costs justify “buy all sell all” over “net metering”?
Are batteries the primary justification for "buy all sell all" over "net metering"?
Are next-gen supercapacitors the solution?
> Ask HN: Do battery costs justify "buy all sell all" over "net metering"?
> Are batteries the primary justification for "buy all sell all" over "net metering"?
> Are next-gen supercapacitors the solution?
With "Net Metering", electric utilities buy consumers' excess generated energy at retail or wholesale rates. https://en.wikipedia.org/wiki/Net_metering
With "Buy All, Sell All", electric utilities require consumers to sell all of the energy they generate from e.g. solar panels (usually at wholesale prices, AFAIU) and buy all of the energy they consume at retail rates. They can't place the meter after any local batteries.
Do I have this right?
Net metering:
(used-generated) x (retail || wholesale)
Buy all, sell all:
(used x retail) - (generated x wholesale)
For the energy generating consumer, net metering is a better deal: they have power when the grid is down, and they keep or earn more for the energy generation capability they choose to invest in.
Break-even on solar panels happens sooner with net metering.
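A worked example of the two billing formulas above, with invented numbers (900 kWh used, 600 kWh generated, $0.13/kWh retail, $0.04/kWh wholesale):

```python
def net_metering_bill(used_kwh, generated_kwh, retail):
    """Consumer pays retail only on net consumption."""
    return round((used_kwh - generated_kwh) * retail, 2)

def buy_all_sell_all_bill(used_kwh, generated_kwh, retail, wholesale):
    """Consumer buys everything at retail, sells all generation at wholesale."""
    return round(used_kwh * retail - generated_kwh * wholesale, 2)

print(net_metering_bill(900, 600, 0.13))            # → 39.0
print(buy_all_sell_all_bill(900, 600, 0.13, 0.04))  # → 93.0
```

Same panels, same weather, more than double the bill under buy-all-sell-all at these (made-up) rates, which is why break-even on the panels arrives so much later.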
Utilities argue that maintaining grid storage and transmission costs money, which justifies paying energy-generating consumers less than they pay more constant sources of energy like dams, wind farms, and commercial solar plants.
Building a two-way power transfer grid costs money. Batteries require replacement after a limited number of cycles. Spiky or bursting power generation is not good for batteries because they don't get a full cycle. [Hemp] supercapacitors can smooth out that load and handle many more partial charge and discharge cycles.
Is energy storage the primary justifying cost driver for "buy all, sell all"?
What investments are needed in order to more strongly incentivize clean energy generation? Do we need low cost supercapacitors to handle the spiky load?
Are these utilities granted a monopoly? Are they price fixing?
Energy demand from blockchain mining has not kept overall demand constant enough to give utilities the profits to invest in clean energy generation and a two-way smart grid that accommodates spiky consumer energy generation. Demand for electricity is falling as we become less wasteful and more energy efficient. As the cost of renewable energy continues to fall (and becomes less expensive than nonrenewables), there should be more margin for energy utilities which cost-rationally and environmentally-rationally choose to buy renewable energy and sell it to consumers.
Please correct me with the appropriate terminology.
How can we more strongly incentivize consumer solar panel investments?
Here's a discussion about the lower costs of hemp supercapacitors as compared with graphene super capacitors: https://news.ycombinator.com/item?id=16800693
""" Hemp supercapacitors might be a good solution to the energy grid storage problem. Hemp absorbs carbon, doesn't leave unplowable roots in the fields, returns up to 70% of nutrients to the soil, and grows quickly just about anywhere. Hemp bast fiber is normally waste. Hemp anodes for supercapacitors are made from the bast fiber that is normally waste.
Graphene is very useful; but industrial production of graphene is dangerous because lungs and blood-brain barrier.
Hemp is an alternative to graphene for modern supercapacitors (which now have much greater [energy density] in Wh/kg)
"Hemp Carbon Makes Supercapacitors Superfast” https://www.asme.org/engineering-topics/articles/energy/hemp...
> “Our device’s electrochemical performance is on par with or better than graphene-based devices,” Mitlin says. “The key advantage is that our electrodes are made from biowaste using a simple process, and therefore, are much cheaper than graphene.”
> Graphene is, however, expensive to manufacture, costing as much as $2,000 per gram. [...] developed a process for converting fibrous hemp waste into a unique graphene-like nanomaterial that outperforms graphene. What’s more, it can be manufactured for less than $500 per ton.
> Hemp fiber waste was pressure-cooked (hydrothermal synthesis) at 180 °C for 24 hours. The resulting carbonized material was treated with potassium hydroxide and then heated to temperatures as high as 800 °C, resulting in the formation of uniquely structured nanosheets. Testing of this material revealed that it discharged 49 kW of power per kg of material—nearly triple what standard commercial electrodes supply, 17 kW/kg.
https://scholar.google.com/scholar?hl=en&q=hemp+supercapacit....
https://en.wikipedia.org/wiki/Supercapacitor
I feel like a broken record mentioning this again and again. """
Portugal electricity generation temporarily reaches 100% renewable
Currently we are passing through an atypical wind and rain period. Our dams are full and pouring out water as we are producing more energy than we consume.
Despite all of this energy prices don't drop as they are bound to one of the few legal monopolies in Europe: the grid operation cartel (REN in the article, there is only one allowed for each EU country).
Also, this feels like fake news, since a few very old (and historically unsafe) fossil fuel plants, like the Sines and Carregado power plants, are still operating with no signs of slowing down, while profitably chemical-engineering their way into the zero-emissions lot through several scams.
are batteries (or some sort of storage) in the infrastructure plan for the future?
Hemp supercapacitors might be a good solution to the energy grid storage problem. Hemp absorbs carbon, doesn't leave unplowable roots in the fields, returns up to 70% of nutrients to the soil, and grows quickly just about anywhere.
Hemp bast fiber is normally waste. Hemp anodes for supercapacitors are made from the bast fiber that is normally waste.
Graphene is very useful, but industrial production of graphene is dangerous because graphene particles can damage lungs and cross the blood-brain barrier.
Hemp is an alternative to graphene for modern supercapacitors (which now have much greater power density in wH/kg)
"Hemp Carbon Makes Supercapacitors Superfast" https://www.asme.org/engineering-topics/articles/energy/hemp...
> “Our device’s electrochemical performance is on par with or better than graphene-based devices,” Mitlin says. “The key advantage is that our electrodes are made from biowaste using a simple process, and therefore, are much cheaper than graphene.”
> Graphene is, however, expensive to manufacture, costing as much as $2,000 per gram. [...] developed a process for converting fibrous hemp waste into a unique graphene-like nanomaterial that outperforms graphene. What’s more, it can be manufactured for less than $500 per ton.
> Hemp fiber waste was pressure-cooked (hydrothermal synthesis) at 180 °C for 24 hours. The resulting carbonized material was treated with potassium hydroxide and then heated to temperatures as high as 800 °C, resulting in the formation of uniquely structured nanosheets. Testing of this material revealed that it discharged 49 kW of power per kg of material—nearly triple what standard commercial electrodes supply, 17 kW/kg.
https://scholar.google.com/scholar?hl=en&q=hemp+supercapacit...
https://en.wikipedia.org/wiki/Supercapacitor
I feel like a broken record mentioning this again and again.
If you're going to be mentioning this again in the future please correct your usage of power/energy density. Power density is measured in W/kg, energy density is measured in Wh/kg. Supercapacitors tend to excel in the former but be poor in the latter. You mentioned power density but used units for energy density. This happens so often in media that I feel the need to correct it even in a comment.
> please correct your usage of power/energy density. Power density is measured in W/kg, energy density is measured in Wh/kg. Supercapacitors tend to excel in the former but be poor in the latter.
I'd update the units; good call. You may have that confused? Traditional supercapacitors have had lower power density and faster charging/discharging. Graphene and hemp somewhat change the game, AFAIU.
It makes sense to put supercapacitors in front of the battery banks because they last so many cycles and because they charge and discharge so quickly (a very helpful capability for handling spiky wind and solar loads).
I think you may still be a little confused. Power density is the rate at which energy can be added to or drawn from the cell per unit mass. So faster charging and discharging means high power density. Energy density is the total amount of energy that can be stored per unit mass. Supercapacitors are typically higher in power density and lower in energy density than batteries[1].
You're right that it makes sense to put supercapacitors in front of the battery banks for the reasons you said.
[1] http://berc.berkeley.edu/storage-wars-batteries-vs-supercapa...
I must have logically assumed that rate of charge and discharge include time (hours) in the unit: Wh/kg.
My understanding is that there's usually a curve over time t that represents the charging rate from empty through full.
[edit]
"C rate"
Battery_(electricity)#C_rate https://en.wikipedia.org/wiki/Battery_(electricity)#C_rate
Battery_charger#C-rates https://en.wikipedia.org/wiki/Battery_charger#C-rates
> Charge and discharge rates are often denoted as C or C-rate, which is a measure of the rate at which a battery is charged or discharged relative to its capacity. As such the C-rate is defined as the charge or discharge current divided by the battery's capacity to store an electrical charge. While rarely stated explicitly, the unit of the C-rate is [h^−1], equivalent to stating the battery's capacity to store an electrical charge in unit hour times current in the same unit as the charge or discharge current.
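As a quick sanity check of that quoted definition, here's a minimal sketch (the function name and numbers are mine, for illustration only):

```python
def c_rate(current_amps: float, capacity_amp_hours: float) -> float:
    """C-rate = charge/discharge current divided by capacity; the unit
    works out to 1/h, as the Wikipedia passage notes."""
    return current_amps / capacity_amp_hours

# Discharging a 2 Ah cell at 4 A is a 2C rate: a full discharge takes
# 1 / (2 h^-1) = 0.5 hours.
rate = c_rate(4.0, 2.0)
hours_to_full_discharge = 1.0 / rate
print(rate, hours_to_full_discharge)  # 2.0 0.5
```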
It does sound amazing and economical, like almost too good to be true, but I very much hope it is true. What are the downsides? Is there a degradation problem or something similar? Other than the stoner connection in people's minds, what kind of resistance is there to this? Why isn't it widely known?
You know, I'm not sure. This article is from a few years ago now and there's not much uptake.
It may be that most people dismiss supercapacitors based on the stats for legacy (pre-graphene/pre-hemp) supercapacitors: bulky, but quick to charge and long-lasting.
It may be that hemp is taxed at up to 90% because it's a controlled substance in the US (but not in Europe, Canada, or China, from which we must import shelled hemp seeds). A historical accident?
GPU Prices Drop ~25% in March as Supply Normalizes
How do these new GPUs compare to those from 10 years ago in terms of FLOPs per Watt? https://en.wikipedia.org/wiki/Performance_per_watt
The new ASICs for Ethereum mining can't be solely responsible for this percent of the market.
(Note that NVIDIA's stock price is up over 1700% over the past 10 years. And that Bitcoin mining on CPUs and GPUs hasn't been profitable for quite a while. In 2007, I don't think we knew that hashing could be done on GPUs; though there were SSL accelerator cards that were mighty expensive.)
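Performance per watt is just peak throughput over board power. A sketch with rough, illustrative single-precision figures (ballpark numbers of my own, not measurements from any benchmark):

```python
def flops_per_watt(peak_gflops: float, tdp_watts: float) -> float:
    """Performance per watt, in GFLOPS/W."""
    return peak_gflops / tdp_watts

# Ballpark 2008-class vs 2018-class GPU figures (illustrative only):
gpu_2008 = flops_per_watt(933.0, 236.0)     # roughly 4 GFLOPS/W
gpu_2018 = flops_per_watt(11300.0, 250.0)   # roughly 45 GFLOPS/W
print(round(gpu_2018 / gpu_2008, 1))        # on the order of a 10x gain
```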
Apple says it’s now powered by renewable energy worldwide
As this article [1] explains, Apple does not (and cannot) actually run on 100% renewable energy globally, as any of its stores/premises/facilities that are connected to local municipal power grids will use whatever power generation method is used on that grid, and that is still likely to be fossil fuel in most locations.
But they purchase Renewable Energy Certificates to offset their use of non-renewable energy, so they can make the claim that their net consumption of non-renewable electricity is negative.
[1] https://www.theverge.com/2018/4/9/17216656/apple-renewable-e...
This is weird. So if I run a wind farm that generates 1 MW(...h? why is this an energy unit and not power? but whatever...) and I use all of that electricity myself, I can also sell 1 REC to someone else so that they claim they run on green electricity. Which means that now either (a) I have to legally claim I run on dirty electricity (which is a lie on its face and make no sense???) or (b) we both claim we run on green power, double-dipping and screwing up the accounting of greenness.
Am I misunderstanding something? How does this work?
No, that's how it works. That's also the reason why Norway, despite using only hydro, has only 40-50% renewable energy in some statistics. They sell green energy certificates to consumers abroad (e.g. in Germany). Officially, Norwegians then use coal power, whereas in reality it's all hydro power. There isn't even enough transmission capacity to the south for that kind of exchange to happen physically.
Wow! And I just realized there seems to be another loophole: that means (say) a company like Apple could start a separate power company in Norway based on hydro power, have that company completely waste 100% of the energy it produces there, and yet "buy" the equivalent REC in another jurisdiction where they run on coal and suddenly get to 100% "green" power... potentially even making more money in tax credits, if there are any, all while consuming more and more dirty power without actually helping anybody shift to renewable energy. Right?
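The certificate accounting being described can be sketched with toy numbers. This is only an illustration of the no-double-counting rule, not any registry's actual method:

```python
# Illustrative REC bookkeeping: the renewable attribute is tracked
# separately from the physical power. If the generator sells the
# certificate, its own consumption must claim the non-renewable
# "residual mix", otherwise the same green MWh is counted twice.
renewable_mwh = 1.0      # hydro actually generated and self-consumed
recs_sold = 1.0          # certificates sold to a consumer abroad

buyer_green_claim = recs_sold
generator_green_claim = renewable_mwh - recs_sold  # 0.0 -> residual mix

total_green_claims = buyer_green_claim + generator_green_claim
assert total_green_claims == renewable_mwh  # books balance; no double count
```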
I think it comes down to giving them credit for funding the construction of massive clean energy projects, even if they don't exclusively use the electricity from that project themselves.
Look at Microsoft. They just funded a deal to build out an absolutely massive solar farm in Virginia.
https://blogs.microsoft.com/on-the-issues/2018/03/21/new-sol...
They do have facilities in state that will only need a fraction of that power to be 100% green, and the excess will be pumped into the local Virginia power grid and used by consumers there.
Basically, Microsoft is saying that they funded the project to generate excess green energy in one place to offset the dirty energy they consume in areas where there is no local green power option available.
100% renewable energy by purchasing and funding renewable energy is an outstanding achievement.
Is there another statistic for measuring how many kWh or MWh are sourced directly from renewable energy sources (or, more logically, 'directly' from batteries + hemp supercapacitors between use and generation)?
Hackers Are So Fed Up with Twitter Bots They’re Hunting Them Down Themselves
This is an interesting approach. Maybe Twitter shouldn't solve the fake-accounts problem directly; maybe they should come up with evaluation criteria and then create a market for identifying fake accounts.
If their evaluation criteria are good, they could get the best possible system built at zero cost (motivated by competition on a market).
There's an open call for papers/proposals for handling the deluge. "Funding will be provided as an unrestricted gift to the proposer's organization(s)" ... "Twitter Health Metrics Proposal Submission" https://blog.twitter.com/official/en_us/topics/company/2018/...
A real hacker move would be to just leave Twitter and go to Mastodon https://joinmastodon.org
Are you suggesting that Mastodon has a better system for identifying harassment, spam, and spam accounts? Or that, given that they're mostly friendly early adopters, they haven't yet encountered the problem?
It seems to me like you don't understand the crucial difference between Twitter and Mastodon.
There's no such thing as Mastodon, a singular social network. Mastodon is a series of instances that talk to each other. A sysadmin running the instance can do whatever he pleases in his instance, including closing the registration, banning entire instances from communicating with his instance, and enforcing whichever rules he wants to enforce.
Mastodon doesn't deal with such issues at all. It's sysadmins running Mastodon instances that are supposed to deal with such issues.
It's more like reddit, where mods of subreddits have nearly complete authority over their own space on the social network, than it is like Twitter, in which a single entity is in charge.
Mastodon is a federated system like StatusNet/GNU Social.
So, in your opinion, Mastodon nodes - by virtue of being federated - would be better equipped to handle the spam and harassment volume that Twitter is subject to?
I find that hard to believe.
ActivityPub (and OStatus, and ActivityStreams/Salmon, and OpenSocial) are all great specs and great ideas. Hosting and moderation cost real money (which spammers/scammers are wasting).
Know what's also great? Learning. For learning, we have the xAPI/TinCan spec and also schema.org/Action.
“We’re committing Twitter to increase the health and civility of conversation”
First Amendment protections apply to suits brought by the government. Civil suits are required to prove damages ("quantum of loss").
There are many open platforms. (I've contributed to those as well). Some are built on open standards. None of said open platforms have procedures or resources for handling the onslaught of disrespectful trash that the people we've raised eventually use these platforms for communicating at other people who have feelings and understand the Golden Rule.
https://en.wikipedia.org/wiki/Golden_Rule
The initial early adopters (who have other better things to do) are fine: helpful, caring, critical, respectful; healthy. And then everyone else comes surging in with hate, disrespect, and vitriol; unhealthy. They don't even realize that being hateful and disrespectful is making them more depressed. They think that complaining and talking smack to people is changing the world. And then they turn off the phone or log out of the computer, and carry on with their lives.
No-one taught them to be the positive, helpful energy they want to attract from the world. No-one properly conditioned them to either respectfully disagree according to the data or sit down and listen. No-one explained to them that a well-founded argument doesn't fit in 140 or 280 characters, but a link and a headline do. No-one explained to them that what they write on the internet lasts forever and will be found by their future interviewers, investors, jurors, and voters. No-one taught them that being respectful and helpful in service of other people - of the group's success, of peaceful coexistence - is the way to get ahead AND be happy. "No-one told me that."
Shareholders of public corporations want to see growth in meaningless numbers, foreign authoritarian governments see free expression as a threat to their ever-so-fragile self-perceptions, political groups seek to frame and smear and malign and discredit (because they are so in need of group acceptance; because money still isn't making them happy), and there are children with too much free time reading all of these.
No-one is holding these people accountable: we need transparency and accountability. We need to focus on more important goals and feel good about helping; about volunteering our time to help others be happier.
Instead, now that these haters and scam artists have all self-identified, we must spend our time conditioning their communications until they learn to respectfully disagree on facts and data or go somewhere else. "That's how you feel? Great. How does that make your victim feel?" is the confrontation that some people are seeking from companies that set out to serve free speech and provide a forum for citizens to share the actual news.
Who's going to pay for that? Can they sue for their costs and losses? Advertisers do not want a spot next to hateful and disrespectful.
"How dare you speak of censorship in such veiled terms!?" Really? They're talking about taking down phrases like "kill" and "should die"; not phrases like "I disagree because:"
So, now, because there are so many hateful economically disadvantaged people in the world with nothing better to do and no idea how to run a business or keep a job with benefits, these companies need to staff 24 hour a day censors to take down the hate and terror and gang recruiting within one hour. What a distorted mirror of our divisively fractured wealth inequality, indeed.
"Ban gangs ASAP, please: they'll just go away"
How much does it cost to pay prison labor to redundantly respond to this trash? Are those the skills they need to choose a different career with benefits and savings that meet or exceed inflation when they get out?
What is the procedure for referring threats of violence to justice in your jurisdiction? Are there wealthy individuals in your community who would love to contribute resources to this effort? Maybe they have some region-specific pointers for helping the have-nots out here trolling like it's going to get them somewhere they want to be in life?
Let me share a little story with you:
A person walks into a bar/restaurant, flicks off the bartender/waiter, orders 5 glasses of free water, starts plastering ads to the walls and other peoples' tables, starts making threats to groups of people cordially conversing, and walks out.
Gitflow – Animated in React
Thanks! A command log would be really helpful too.
The HubFlow docs contain GitFlow docs and some really helpful diagrams: https://datasift.github.io/gitflow/IntroducingGitFlow.html
I change the release prefix to 'v' so that the git tags for the release look like 'v0.0.1' and 'v0.1.0':
git config --replace-all gitflow.prefix.versiontag v
git config --replace-all hubflow.prefix.versiontag v
I usually use HubFlow instead of GitFlow because it requires there to be a Pull Request; though GitFlow does work when offline / without access to GitHub.
Sure.. will add the command log
Ask HN: How feasible is it to become proficient in several disciplines?
For example to become a professional in:
- back-end api development
- DevOps
- Data Engineer (big data, data science, ML, etc)
It is feasible, though with that kind of breadth you risk becoming a "jack of all trades, master of none". Maybe a title like "Full Stack Data Engineer" would be descriptive.
You could write an OAuth API for accepting and performing analysis of datasets (model fitting / parameter estimation; classification or prediction), write a test suite, write Kubernetes YAML for a load-balanced geodistributed dev/test/prod architecture, and continuously deploy said application (from branch merges, optionally with a manual confirmation step; e.g. with GitLab CI) and still not be an actual Data Engineer.
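For instance, the "model fitting / parameter estimation" piece of that hypothetical API could be as small as closed-form simple linear regression (stdlib only; the function name and data here are made up for illustration):

```python
# Closed-form simple linear regression: the kind of parameter-estimation
# step such an analysis API would run on an uploaded dataset.
def fit_line(xs, ys):
    n = len(xs)
    mean_x, mean_y = sum(xs) / n, sum(ys) / n
    sxx = sum((x - mean_x) ** 2 for x in xs)
    sxy = sum((x - mean_x) * (y - mean_y) for x, y in zip(xs, ys))
    slope = sxy / sxx
    return slope, mean_y - slope * mean_x

xs = [0, 1, 2, 3, 4]
ys = [1.0, 4.1, 6.9, 10.0, 13.0]   # roughly y = 3x + 1
slope, intercept = fit_line(xs, ys)
```

The test suite, Kubernetes YAML, and CI pipeline then wrap steps like this one; being able to write each layer is still different from doing data engineering full-time.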
After rising for 100 years, electricity demand is flat
Cryptocurrency mining is about to consume more electricity than home usage in Iceland.[1] I assume it's similar in other places that have cheaper electricity.
Seems that power companies should encourage consumers to mine Bitcoin. Problem solved.
[1] https://qz.com/1204840/iceland-will-use-more-electricity-min...
> Seems that power companies should encourage consumers to mine Bitcoin. Problem solved.
Blockchains will likely continue to generate considerable demand for electricity for the foreseeable future.
Blockchain firms can locate where energy is cheapest. Currently that's in countries where energy prices go negative due to excess capacity and insufficient energy storage resources (batteries, [hemp/graphene] supercapacitors, water towers).
With continued demand, energy companies can continue to invest in new clean energy generation alternatives.
Unfortunately, in the current administration's proposed budget, funding for ARPA-E is cancelled and reallocated to clean coal, which Canada, France, and the UK are committed to phasing out entirely by ~2030.
A framework for evaluating data scientist competency
Something in HTML with local state, like the "Programmer Competency Matrix", would be great.
http://sijinjoseph.com/programmer-competency-matrix/
Levi Strauss to use lasers instead of people to finish jeans
Chaos Engineering: the history, principles, and practice
awesome-chaos-engineering lists a bunch of chaos engineering resources and tools such as Gremlin: https://github.com/dastergon/awesome-chaos-engineering
Scientists use an atomic clock to measure the height of a mountain
Quantum_clock#More_accurate_experimental_clocks: https://en.wikipedia.org/wiki/Quantum_clock#More_accurate_ex...
> In 2015 JILA evaluated the absolute frequency uncertainty of their latest strontium-87 optical lattice clock at 2.1 × 10−18, which corresponds to a measurable gravitational time dilation for an elevation change of 2 cm (0.79 in) on planet Earth that according to JILA/NIST Fellow Jun Ye is "getting really close to being useful for relativistic geodesy".
AFAIU, this type of geodesy isn't possible with 'normal' time structs. Are nanoseconds enough?
"[Python-Dev] PEP 564: Add new time functions with nanosecond resolution" https://mail.python.org/pipermail/python-dev/2017-October/14...
The thread you linked makes a good point: there isn't really any reason to care about the actual time, only the relative time. These sorts of clocks just use 256- or 512-bit counters; it's not like they're having overflow issues.
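For reference, PEP 564 adds integer-nanosecond clocks (Python 3.7+) precisely because float seconds cannot resolve single nanoseconds at current epoch values. A small illustration (math.ulp requires Python 3.9+):

```python
import math
import time

t_ns = time.time_ns()   # int nanoseconds since the epoch (PEP 564)
t = time.time()         # float seconds since the epoch

# Near a 2018-era timestamp (~1.5e9 s), the gap between adjacent 64-bit
# floats is a few hundred nanoseconds, so time.time() cannot express
# single-nanosecond steps; the integer clock can.
print(math.ulp(1.5e9))  # ~2.4e-7 seconds (~238 ns)
```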
Resources to learn project management best practices?
My side project is beginning to attract interest from a few people who would like to hop on board. At this point I am just doing what feels familiar and sensible, but the project manager perspective is new to me. Are there any sort of articles/books/podcasts/etc that could clue me into how to become better at it?
Project Management: https://wrdrd.github.io/docs/consulting/software-development... ... #requirements-traceability, #work-breakdown-structure (Mission, Project, Goal/Objective #n; Issue #n, - [ ] Task)
"Ask HN: How do you, as a developer, set measurable and actionable goals?" https://westurner.github.io/hnlog/#story-15119635
- Burndown Chart, User Stories
... GitHub and GitLab have milestones and reorderable issue boards. I still like https://waffle.io for complexity points; though you can also just create labels for e.g. complexity (Complexity-5) and priority (Priority-5).
Ask HN: Thoughts on a website-embeddable, credential validating service?
Reading Troy Hunt's password release V2 blog post [0], I came across the NIST recommendation to prevent users from creating accounts with passwords discovered in data breaches. This got me thinking: would a website admin (ex. small business owner with a custom website) benefit from a service that validates user passwords? The idea is to create a registration iframe with forms for email, password, etc., which would check hashed credentials against a database of data from breaches. Additionally, client-side validation would enforce rules recommended by the NIST's Digital Identity Guidelines [1], which would relieve admins from implementing their own rules. I'm sure there are additional security features that can be added.
1. Have you seen a need for this type of service, and could you see this being adopted at all?
2. Do you know of a service like this? I've looked, no hits so far.
3. Does the architecture seem sound?
[0]: https://www.troyhunt.com/ive-just-launched-pwned-passwords-version-2/
[1]: https://www.nist.gov/itl/tig/projects/special-publication-800-63
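The breach-check half of that idea can avoid sending credentials to the third party at all, using the k-anonymity range model Pwned Passwords v2 introduced: only the first 5 hex chars of the SHA-1 hash leave the client. A sketch (helper names are mine; the network call is only described in the comment):

```python
import hashlib

def sha1_prefix_suffix(password: str):
    """Split the uppercase SHA-1 hex digest into the 5-char prefix that is
    sent to the API (k-anonymity) and the 35-char suffix that stays local."""
    digest = hashlib.sha1(password.encode("utf-8")).hexdigest().upper()
    return digest[:5], digest[5:]

def match_count(range_body: str, suffix: str) -> int:
    """Scan a /range response ("SUFFIX:COUNT" per line) for our suffix;
    0 means the password was not found in any known breach."""
    for line in range_body.splitlines():
        candidate, _, count = line.partition(":")
        if candidate == suffix:
            return int(count)
    return 0

# Usage sketch (network call omitted): GET
#   https://api.pwnedpasswords.com/range/<prefix>
# and pass the response body to match_count(body, suffix).
prefix, suffix = sha1_prefix_suffix("password")
```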
blockchain-certificates/cert-verifier-js: https://github.com/blockchain-certificates/cert-verifier-js
> A library to enable parsing and verifying a Blockcert. This can be used as a node package or in a browser. The browserified script is available as verifier.js.
https://github.com/blockchain-certificates/cert-issuer
> The cert-issuer project issues blockchain certificates by creating a transaction from the issuing institution to the recipient on the Bitcoin blockchain that includes the hash of the certificate itself.
... We could/should also store X.509 cert hashes in a blockchain.
Exactly what part of such a service would benefit from anything related to a blockchain?
Are you asking me why blockcerts stores certs in a blockchain?
Or whether using certs (really long passwords) is a better option than submitting unhashed passwords on a given datetime to a third-party in order to make sure they're not in the pwned passwords tables?
I was just reading about a company trying to make self-sovereign identity, including actual certs (like degrees and such), an accessible and widely applicable/acceptable technology using the Ethereum blockchain. I thought it showed some real practicality and promise. I believe it begins with U; forgot the name. Perhaps UPort? Anyhow, I'd be interested in hearing from anyone here about why that might be a bad or good idea. I don't personally have the skill in that tech to know.
Known Traveler Digital Identity system is a "new model for airport screening and security that uses biometrics, cryptography and distributed ledger technologies."
Blockcerts are for academic credentials, AFAIU.
[EDIT]
Existing blockchains have a limited TPS (transactions per second) for writes; but not for reads. Sharding and layer-2 (sidechains) do not have the same assurances. I'm sure we all remember how cryptokitties congested the txpool during the Bitcoin futures launch.
Thank you. I looked into "clear", one of the airport known-traveler ID systems (I'm assuming there are others). It's pretty cool/concerning. Takes ~8 min to load a traveler into its system. Thanks for reminding me of TPS, and the info on read vs. write.
Ask HN: What's the best algorithms and data structures online course?
These aren't courses, but from answers to "Ask HN: Recommended course/website/book to learn data structure and algorithms" :
Data Structure: https://en.wikipedia.org/wiki/Data_structure
Algorithm: https://en.wikipedia.org/wiki/Algorithm
Big O notation: https://en.wikipedia.org/wiki/Big_O_notation
Big-O Cheatsheet: http://bigocheatsheet.com
Coding Interview University > Data Structures: https://github.com/jwasham/coding-interview-university/blob/...
OSSU: Open Source Society University > Core CS > Core Theory > "Algorithms: Design and Analysis, Part I" [&2] https://github.com/ossu/computer-science/blob/master/README....
"Algorithms, 4th Edition" (2011; Sedgewick, Wayne): https://algs4.cs.princeton.edu/
Complexity Zoo > Petting Zoo (P, NP, ...): https://complexityzoo.uwaterloo.ca/Petting_Zoo
While perusing awesome-awesomeness [1], I found awesome-algorithms [2], algovis [3], and awesome-big-o [4].
[1] https://github.com/bayandin/awesome-awesomeness
[2] https://github.com/tayllan/awesome-algorithms
Using Go as a scripting language in Linux
I, too, didn't realize that shebang parsing is implemented in the `binfmt_script` kernel module.
Does this persist across reboots?
echo ':golang:E::go::/usr/local/bin/gorun:OC' | sudo tee /proc/sys/fs/binfmt_misc/register
No, but different init systems may autoload formats based on some configuration files. Systemd, for instance: https://www.freedesktop.org/software/systemd/man/systemd-bin...
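On systemd hosts, one common way to persist the registration (assuming systemd-binfmt(8); the filename is illustrative) is a binfmt.d config fragment containing the same registration string, which is re-applied at boot:

```
# /etc/binfmt.d/golang.conf -- applied at boot by systemd-binfmt.service
:golang:E::go::/usr/local/bin/gorun:OC
```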
Guidelines for enquiries regarding the regulatory framework for ICOs [pdf]
This includes a helpful table indicating whether a Payment, Utility, Asset, or Hybrid coin/token is a security and whether it qualifies under Swiss AML payment law.
The "Minimum information requirements for ICO enquiries" appendix seems like a good set of questions for evaluating ICOs. Are there other good questions to ask when considering whether to invest in a Payment, Utility, Asset, or Hybrid ICO?
Are US regulations different from these clear and helpful regulatory guidelines for ICOs in Switzerland?
> Are there other good questions to ask when considering whether to invest in a Payment, Utility, Asset, or Hybrid ICO?
This paper doesn't seem to cover that, only how regulators should treat investments.
On investing in ICOs, the questions are the same as any other IPO. And, in most cases, just speculation.
The Benjamin Franklin method for learning more from programming books
> Read your programming book as normal. When you get to a code sample, read it over
> Then close the book.
> Then try to type it up.
According to a passage in "The Autobiography of Benjamin Franklin" (1791) regarding re-typing from "The Spectator"
https://en.wikipedia.org/wiki/The_Autobiography_of_Benjamin_...
Avoiding blackouts with 100% renewable energy
I notice that cases A and C require batteries for storage.
Should there be a separate entry for new gen supercapacitors? Supercapacitors built with both graphene and hemp have different Max Charge Rate (GW), Max Discharge Rate (GW), and Storage (TWh) capacities than even future-extrapolated batteries and current supercapacitors.
https://en.wikipedia.org/wiki/Supercapacitor
The cost and capabilities stats in this article look very promising:
"Hemp Carbon Makes Supercapacitors Superfast" https://www.asme.org/engineering-topics/articles/energy/hemp...
> “Our device’s electrochemical performance is on par with or better than graphene-based devices,” Mitlin says. “The key advantage is that our electrodes are made from biowaste using a simple process, and therefore, are much cheaper than graphene.”
> Graphene is, however, expensive to manufacture, costing as much as $2,000 per gram. [...] developed a process for converting fibrous hemp waste into a unique graphene-like nanomaterial that outperforms graphene. What’s more, it can be manufactured for less than $500 per ton.
> Hemp fiber waste was pressure-cooked (hydrothermal synthesis) at 180 °C for 24 hours. The resulting carbonized material was treated with potassium hydroxide and then heated to temperatures as high as 800 °C, resulting in the formation of uniquely structured nanosheets. Testing of this material revealed that it discharged 49 kW of power per kg of material—nearly triple what standard commercial electrodes supply, 17 kW/kg.
https://scholar.google.com/scholar?hl=en&q=hemp+supercapacit...
To be clear, supercapacitors are an alternative to li-ion batteries.
"Matching demand with supply at low cost in 139 countries among 20 world regions with 100% intermittent wind, water, and sunlight (WWS) for all purposes" (Renewable Energy, 2018) https://web.stanford.edu/group/efmh/jacobson/Articles/I/Comb...
Ask HN: What are some common abbreviations you use as a developer?
These are called 'codelabels'. They're great for prefix-tagging commit messages, pull requests, and todo lists:
BLD: build
BUG: bug
CLN: cleanup
DOC: documentation
ENH: enhancement
ETC: config
PRF: performance
REF: refactor
RLS: release
SEC: security
TST: test
UBY: usability
DAT: data
SCH: schema
REQ: requirement
REQ: request
ANN: announcement
STORY: user story
EPIC: grouping of user stories
There's a table of these codelabels here: https://wrdrd.github.io/docs/consulting/software-development...
Someday TODO FIXME XXX I'll get around to:
- [ ] DOC: create a separate site/organization for codelabels
- [ ] ENH: a tool for creating/renaming GitHub labels with unique foreground and background colors
YAGNI: Ya' ain't gonna need it
LOL, lulz
DRY: Don't Repeat Yourself
KISS: Keep It Super Simple
MVC: Model-View-Controller
MVT: Model-View-Template
MVVM: Model-View-View-Model
UI: User Interface
UX: User Experience
GUI: Graphical User Interface
CLI: Command Line Interface
CAP: Consistency, Availability, Partition tolerance
DHT: Distributed Hash Table
ETL: Extract, Transform, and Load
ESB: Enterprise Service Bus
MQ: Message Queue
VM: Virtual Machine
LXC: Linux Containers
[D]VCS, RCS: [Distributed] Version/Revision Control System
XP: Extreme Programming
CI: Continuous Integration
CD: Continuous Deployment
TDD: Test-Driven Development
BDD: Behavior-Driven Development
DFS, BFS: Depth/Breadth First Search
CRM: Customer Relationship Management
CMS: Content Management System
LMS: Learning Management System
ERP: Enterprise Resource Planning system
HTTP: Hypertext Transfer Protocol
HTTP STS: HTTP Strict Transport Security
REST: Representational State Transfer
API: Application Programming Interface
HTML: Hypertext Markup Language
DOM: Document Object Model
LD: Linked Data
LOD: Linked Open Data
URI: Uniform Resource Identifier
URN: Uniform Resource Name
URL: Uniform Resource Locator
UUID: Universally Unique Identifier
RDF: Resource Description Framework
RDFS: RDF Schema
OWL: Web Ontology Language
JSON-LD: JSON Linked Data
JSON: JavaScript Object Notation
CSVW: CSV on the Web
CSV: Comma Separated Values
CIA: Confidentiality, Integrity, Availability
ACL: Access Control List
RBAC: Role-Based Access Control
MAC: Mandatory Access Control
CWE: Common Weakness Enumeration
CVE: Common Vulnerabilities and Exposures
XSS: Cross-Site Scripting
CSRF: Cross-Site Request Forgery
SQLi: SQL Injection
ORM: Object-Relational Mapping
AUC: Area Under Curve
ROC: Receiver Operating Characteristic
DL: Description Logic
RL: Reinforcement Learning
CNN: Convolutional Neural Network
DNN: Deep Neural Network
IS: Information Systems
ROI: Return on Investment
RPU: Revenue per User
MAU: Monthly Active Users
DAU: Daily Active Users
STEM: Science, Technology, Engineering, Mathematics/Medicine
STEAM: STEM + Arts
W3C: World Wide Web Consortium
GNU: GNU's not Unix
WRDRD: WRD R&D
... The Sphinx ``.. index::`` directive makes it easy to include index entries for acronym forms, too https://wrdrd.github.io/docs/genindex
There Might Be No Way to Live Comfortably Without Also Ruining the Planet
"A good life for all within planetary boundaries" (2018) https://www.nature.com/articles/s41893-018-0021-4
> Abstract: Humanity faces the challenge of how to achieve a high quality of life for over 7 billion people without destabilizing critical planetary processes. Using indicators designed to measure a ‘safe and just’ development space, we quantify the resource use associated with meeting basic human needs, and compare this to downscaled planetary boundaries for over 150 nations. We find that no country meets basic needs for its citizens at a globally sustainable level of resource use. Physical needs such as nutrition, sanitation, access to electricity and the elimination of extreme poverty could likely be met for all people without transgressing planetary boundaries. However, the universal achievement of more qualitative goals (for example, high life satisfaction) would require a level of resource use that is 2–6 times the sustainable level, based on current relationships. Strategies to improve physical and social provisioning systems, with a focus on sufficiency and equity, have the potential to move nations towards sustainability, but the challenge remains substantial.
> "Radical changes are needed if all people are to live well within the limits of the planet," [...]
> "These include moving beyond the pursuit of economic growth in wealthy nations, shifting rapidly from fossil fuels to renewable energy, and significantly reducing inequality.
> "Our physical infrastructure and the way we distribute resources are both part of what we call provisioning systems. If all people are to lead a good life within the planet's limits then these provisioning systems need to be fundamentally restructured to allow for basic needs to be met at a much lower level of resource use."
Perhaps ironically, the sustainability (resource-efficiency) technologies we'd need for a civilization on Mars are directly relevant to solving these problems on Earth.
Recycle everything.
Survive without soil, steel, hydrocarbons, animals, oxygen.
Convert CO2, sunlight, H2O, and geothermal energy to forms necessary for life.
https://en.wikipedia.org/wiki/Colonization_of_Mars
Algae, carbon capture, carbon sequestration, lab grown plants, water purification, solar power, [...]
Mars requires a geomagnetic field in order to sustain an atmosphere in order to [...].
"The Limits to Growth" (1972, 2004) [1] very clearly forecasts these same unsustainable patterns of resource consumption: 'needs' which exceed and transgress our planetary biophysical boundaries.
The 17 UN Sustainable Development Goals (#GlobalGoals) [2] outline our worthwhile international objectives (Goals, Targets, and Indicators). The Paris Agreement [3] sets targets and asks for commitments from nation states (and businesses) to help achieve these goals most efficiently and most sustainably.
In the US, the Clean Power Plan [4] was intended to redirect our national resources toward renewable energy with far less external costs. Direct and indirect subsidies for nonrenewables are irrational. Are subsidies helpful or necessary to reach production volumes of renewable energy products and services?
There are certainly financial incentives for anyone who chooses to invest in solving for the Global Goals; and everyone can!
[1] https://en.wikipedia.org/wiki/The_Limits_to_Growth
[2] http://www.un.org/sustainabledevelopment/sustainable-develop...
Multiple GWAS finds 187 intelligence genes and role for neurogenesis/myelination
> We found evidence that neurogenesis and myelination—as well as genes expressed in the synapse, and those involved in the regulation of the nervous system—may explain some of the biological differences in intelligence.
re: nurture, hippocampal plasticity and hippocampal neurogenesis also appear to be affected by dancing and omega-3,6 (which are transformed into endocannabinoids by the body): https://news.ycombinator.com/item?id=15109698
Could we solve blockchain scaling with terabyte-sized blocks?
These numbers in a computational model (or even Jupyter notebooks) would be useful.
We may indeed need fractional satoshis ('naks').
With terabyte blocks, lightning network would be unnecessary: at least for TPS.
There will need to be changes to account for quantum computing capabilities somewhere in the future timeline of Bitcoin (and everything else in banking and value-producing industry): probably a different hash function rather than just a routine difficulty increase (and definitely something other than ECDSA, which isn't a primary cost). $1.3m/400k a year to operate a terabyte mining rig with 50Gbps bandwidth would affect decentralization; though maybe not any more than it is already affected now.
https://en.bitcoin.it/wiki/Weaknesses#Attacker_has_a_lot_of_... (51%)
Confidence intervals for these numbers would be useful.
Casper PoS and beyond may also affect future Bitcoin volume estimates.
Ask HN: Do you have ADD/ADHD? How do you manage it?
Also, how has it affected your CS career? I feel that transitioning to management would help, as it does not require lengthy periods of concentration, but rather distributed attention for shorter periods.
Music. Headphones. Chillstep, progressive, chillout etc. from di.fm. Long mixes from SoundCloud with and without vocals. "Instrumental"
Breathe in through the nose and out through the mouth.
Less sugar and processed foods. Though everyone has a different resting glucose level.
Apparently it's called alpha-pinene.
Fidget things. Rubberband, paperclip.
The Pomodoro Technique: work 25 minutes, chill for 5 (and look at something at least 20 feet away (20-20-20 rule))
Lists. GTD. WBS.
Exercise. Short walks.
Ask HN: How to understand the large codebase of an open-source project?
Hello All!
What techniques have you all used to learn and understand a large codebase? What tools do you use?
What is the best way to learn to code from absolute scratch?
We have been hosting a Ugandan refugee in our home in Oakland for the past 9 months and he wants to learn how to code.
Where is the best place for him to start from absolute scratch? What resources can we point him to? Who can help?
Here's an answer to a similar question: "Ask HN: How to introduce someone to programming concepts during 12-hour drive?" https://news.ycombinator.com/item?id=15454421
https://learnxinyminutes.com/docs/python3/ (Python3)
https://learnxinyminutes.com/docs/javascript/ (Javascript)
https://learnxinyminutes.com/docs/git/ (Git)
https://learnxinyminutes.com/docs/markdown/ (Markdown)
Read the docs. Read the source. Write docstrings. Write automated tests: that's the other half of the code.
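A tiny sketch of what "write docstrings, write automated tests" looks like in Python (the function here is just an illustration):

```python
def add(a, b):
    """Return the sum of a and b."""
    return a + b

def test_add():
    # The test is executable documentation of expected behavior.
    assert add(2, 3) == 5
    assert add(-1, 1) == 0

test_add()  # raises AssertionError on regression
```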
Keep a journal of your knowledge as e.g. Markdown or ReStructuredText; regularly pull the good ones from bookmarks and history into an outline.
I keep a tools reference doc with links to Wikipedia, Homepage, Source, Docs: https://wrdrd.github.io/docs/tools/
And a single-page log of my comments: https://westurner.github.io/hnlog/
> To get a job, "Coding Interview University": https://github.com/jwasham/coding-interview-university
Wow, thank you so much. This is incredibly useful.
Tesla racing series: Electric cars get the green light – Roadshow
Tesla Racing Circuit ideas for increasing power discharge rate, reducing heat, and reducing build weight:
Hemp supercapacitors (similar power density as graphene supercapacitors and li-ion, lower cost than graphene)
Active cooling. Modified passive cooling.
Biocomposite frame and panels (stronger and lighter than steel and aluminum (George Washington Carver))
> Biocomposite frame and panels (stronger and lighter than steel and aluminum (George Washington Carver))
"Soybean Car" (1941) https://en.wikipedia.org/wiki/Soybean_car
What happens if you have too many jupyter notebooks?
These days there is a tendency in data analysis to use Jupyter Notebooks. But what happens if you have too many jupyter notebooks? For example, there are more than a hundred.
Eventually you start creating modules. However, working with them is less convenient than before: you end up coding in the web interface, in something like a Notepad++-style form, or you have to change your IDLE.
Personally, I work in PyCharm, and so far I haven't been able to use a remote interpreter or VCS. That's because the pickle files and word2vec models weigh too much (3 GB+), so I don't want to download/upload them. Jupyter also isn't great in PyCharm.
Do you have better practices in your companies? How to correctly adjust IDLE? Do you know about any possible substitution for the IPython notebook in the world of data analysis?
> what happens if you have too many jupyter notebooks? For example, there are more than a hundred.
Like anything else, Jupyter Notebook is limited by the CPU and RAM of the system hosting the Tornado server and Jupyter kernels.
At 100 notebooks (or even just one), it may be a good time to factor common routines into a packaged module with tests and documentation.
It's actually possible (though inefficient) to import code from Jupyter notebooks with ipython/ipynb (pypi:ipynb): https://github.com/ipython/ipynb ( https://jupyter-notebook.readthedocs.io/en/stable/examples/N... )
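A minimal sketch of factoring shared notebook code into an importable module; the module and function names here are hypothetical, and in practice you'd make a real package with a pyproject.toml rather than writing into a temp directory:

```python
# Hypothetical refactor: move a routine shared by many notebooks
# into a module that each notebook can import.
import importlib
import sys
import tempfile
from pathlib import Path

pkg_dir = Path(tempfile.mkdtemp())
(pkg_dir / "analysis_utils.py").write_text(
    'def normalize(values):\n'
    '    """Scale values so they sum to 1.0 (shared across notebooks)."""\n'
    '    total = sum(values)\n'
    '    return [v / total for v in values]\n'
)
sys.path.insert(0, str(pkg_dir))
analysis_utils = importlib.import_module("analysis_utils")
print(analysis_utils.normalize([1, 1, 2]))  # [0.25, 0.25, 0.5]
```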
> Eventually you start creating modules. However, working with them is less convenient than before: you end up coding in the web interface, in something like a Notepad++-style form, or you have to change your IDLE.
The Spyder IDE has support for .ipynb notebooks converted to .py (which keep the IPython prompt markers in them). Spyder can connect an interpreter prompt to a running IPython/Jupyter kernel. There's also a Spyder plugin for Jupyter Notebook: https://github.com/spyder-ide/spyder-notebook
> Personally, I work in PyCharm, and so far I haven't been able to use a remote interpreter or VCS. That's because the pickle files and word2vec models weigh too much (3 GB+), so I don't want to download/upload them.
Remote data access times can be made faster by increasing the space efficiency of the storage format, increasing the bandwidth of the connection, moving the data to the code, or moving the code to the data.
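As a tiny illustration of "increasing the space efficiency of the storage format" with only the standard library (gzip + pickle here; columnar formats like Parquet or HDF5 would be better choices for real tabular data):

```python
import gzip
import pickle

# Illustrative only: highly repetitive data (like zero-heavy model
# weights) compresses well, shrinking what must move over the wire.
data = {"weights": [0.0] * 100_000}
raw = pickle.dumps(data)
compressed = gzip.compress(raw)
print(f"{len(raw)} bytes -> {len(compressed)} bytes")
```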
> Do you have better practices in your companies?
There are a number of [Reproducible] Data Science cookiecutter templates which have a directory for notebooks, module packaging, and Sphinx docs: https://cookiecutter.readthedocs.io/en/latest/readme.html#da...
Refactoring increases testability and code reuse.
> How to correctly adjust IDLE?
I don't think I understand the question?
"Configuring IPython" https://ipython.readthedocs.io/en/stable/config/index.html
Jupyter > "Installation, Configuration, and Usage" https://jupyter.readthedocs.io/en/latest/projects/content-pr...
> Do you know about any possible substitution for the IPython notebook in the world of data analysis?
From https://en.wikipedia.org/wiki/Notebook_interface :
> > "Examples of the notebook interface include the Mathematica notebook, Maple worksheet, MATLAB notebook, IPython/Jupyter, R Markdown, Apache Zeppelin, Apache Spark Notebook, and the Databricks cloud."
There are lots of Jupyter kernels for different tools and languages (over 100; including for other 'notebook interfaces'): https://github.com/jupyter/jupyter/wiki/Jupyter-kernels
And there are lots of Jupyter integrations and extensions: https://github.com/quobit/awesome-python-in-education/blob/m...
Cancer ‘vaccine’ eliminates tumors in mice
The article is about this study:
"Eradication of spontaneous malignancy by local immunotherapy" http://stm.sciencemag.org/content/10/426/eaan4488
> In situ vaccination with low doses of TLR ligands and anti-OX40 antibodies can cure widespread cancers in preclinical models.
Boosting teeth’s healing ability by mobilizing stem cells in dental pulp
Tideglusib
https://en.wikipedia.org/wiki/Tideglusib
> "Promotion of natural tooth repair by small molecule GSK3 antagonists" https://www.nature.com/articles/srep39654
> [...] Here we describe a novel, biological approach to dentine restoration that stimulates the natural formation of reparative dentine via the mobilisation of resident stem cells in the tooth pulp.
This Biodegradable Paper Donut Could Let Us Reforest the Planet
"These drones can plant 100,000 trees a day" https://news.ycombinator.com/item?id=16260892
Drones that can plant 100k trees a day
> It’s simple maths. We are chopping down about 15 billion trees a year and planting about 9 billion. So there’s a net loss of 6 billion trees a year.
This is a regional thing [0] though. We need to plant the trees in Latin America, Caribbean, and Sub-Saharan Africa. The rest of the world is gaining forests.
[0] http://www.telegraph.co.uk/news/2016/03/23/deforestation-whe...
Now the question is: which industry sector cuts the largest percentage of trees? And having answered that, is there a way to do the same thing without cutting trees?
Answering that question and then executing a business plan is probably worth billions.
Planting trees to combat 6 billion trees lost every year is a pure expense, with no profit to be made at all. At least not unless you cut them down a few decades later.
A very good way to absorb atmospheric carbon is to plant new trees and cut down and use old ones for anything else but burning it.
"This Biodegradable Paper Donut Could Let Us Reforest The Planet" https://news.ycombinator.com/item?id=16261101
I think this kind of thing is funny for us westerners. I recently found this tech has been used in arid countries for possibly hundreds or thousands of years.
https://duckduckgo.com/?q=ollas+gardening&t=lm&iax=images&ia...
I do realize the ollas are for more permanent gardens, but it's the same concept. I do hope the paper version gets used, there are plenty of places that could benefit from it.
What are some YouTube channels to progress into advanced levels of programming?
There are some cool YouTube channel suggestions on https://news.ycombinator.com/item?id=16224165 But I wanted to know: which of those are good for progressing to an advanced level of programming? Which of the channels teach advanced techniques?
These aren't channels, but a number of the links are marked with "(video)":
https://github.com/jwasham/coding-interview-university
https://github.com/jwasham/coding-interview-university/blob/...
Thanks
Multiple issue and pull request templates
+1
Default: /ISSUE_TEMPLATE.md
/ISSUE_TEMPLATE/<name>.md
Default: /PULL_REQUEST_TEMPLATE.md
/PULL_REQUEST_TEMPLATE/<name>.md
You can leave the last S off (for savings) if you want /ISSUE_TEMPLATE(S)/ or /PULL_REQUEST_TEMPLATE(S)/
Five myths about Bitcoin’s energy use
Regarding the proof-of-stake part: Ethereum devs are also working on sharding, which will make scaling on the Ethereum blockchain much easier. I really think that Ethereum will be the No. 1 cryptocurrency in the near future. Bitcoin devs have shown plenty of times that they are not capable of keeping pace with demand (they couldn't even get the block size thing right). Bitcoin is virtually dead. No one can really use it for real-world transactions. Plus, Bitcoin is in reality a really centralized coin, completely in the hands of the miners. Even if the Bitcoin devs decided to go with PoS, the miners wouldn't agree.
Proof of Work (Bitcoin*, ...), Proof of Stake (Ethereum Casper), Proof of Space, Proof of Research (GridCoin, CureCoin,)
Plasma (Ethereum) and Lightning Network (BitCoin (SHA256), Litecoin (scrypt),) will likely offload a significant amount of transaction volume and thereby reduce the kWh/transaction metrics.
> But electricity costs matter even more to a Bitcoin miner than typical heavy industry. Electricity costs can be 30-70% of their total costs of operation.
> [...] If Bitcoin mining really does begin to consume vast quantities of the global electricity supply it will, it follows, spur massive growth in efficient electricity production—i.e. the green energy revolution. Moore’s Law was partially a story about incredible advances in materials science, but it was also a story about incredible demand for computing that drove those advances and made semiconductor research and development profitable. If you want to see a Moore’s-Law-like revolution in energy, then you should be rooting for, and not against, Bitcoin. The fact is that the Bitcoin network, right now, is providing a $200,000 bounty every 10 minutes (the mining reward) to the person who can find the cheapest energy on the planet.
This is ridiculous. The economy is already incentivized to find cheaper electricity by forces far more powerful than Bitcoin. By this logic, you would defend a craze for trying to boil the sea.
If the market had internalized the external health, environmental, and defense costs of nonrenewable energy, we would already have cheap, plentiful renewable energy. But we don't: the market is failing to optimize for factors other than margin. (New Keynesian economics admits market failure, but not non-rationality.)
So, (speculative_valuation - cost) is the margin. Whereas with a stock in a leveraged high-frequency market with shorting, (shareholder_equity - market_cap) is explainable in terms of the market information that is shared.
So, it's actually ~$200K - (n_kWh * cost_per_kWh) for whoever wins the block mining lottery (which pays out about every 10 minutes and can go to anyone who's mining).
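To make that margin formula concrete, here's a sketch with purely hypothetical numbers; the kWh figure and electricity cost below are made up for illustration:

```python
# Purely hypothetical numbers to make the margin formula concrete:
block_reward_usd = 200_000   # approximate USD value of one block reward
n_kwh = 1_500_000            # assumed kWh spent per ~10-minute block
cost_per_kwh = 0.05          # assumed electricity price, $/kWh

margin = block_reward_usd - n_kwh * cost_per_kwh
print(margin)  # 125000.0
```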
But the point about Bitcoin maintaining energy demand while we move to competitive, lower-cost renewable energy and greater efficiency is good.
What we should hope to see is the blockchain industry directly investing in clean energy capacity development in order to rationally minimize their primary costs and maximize environmental sustainability.
It is kind of missing the point though. If everyone was competing to make electrical single-person planes to help people commute to work it would also increase demand for renewable/electrical energy - but can we agree, that just using electrical cars instead of planes is going to take a lot less electricity overall?
Yes, and then energy prices would decrease due to less demand. Blockchain energy usage maintains demand for energy, which keeps prices high enough that production of renewables can profitably compete with nonrenewables while we reach production volumes of solar, wind, and hemp supercapacitors for grid storage.
> Throughout the first half of 2008, oil regularly reached record high prices.[2][3][4][5] Prices on June 27, 2008, touched $141.71/barrel, for August delivery in the New York Mercantile Exchange [...] The highest recorded price per barrel maximum of $147.02 was reached on July 11, 2008.
At that price, there's more demand for renewables (such as electric vehicles and solar panels)
> Since late 2013 the oil price has fallen below the $100 mark, plummeting below the $50 mark one year later.
https://en.wikipedia.org/wiki/World_oil_market_chronology_fr...
... Energy costs and inflation are highly covariate. (Trouble is, CPI All rarely ever goes back down)
Bitcoin's energy use will tend to rise to match the mining profits. As long as both BTC's price and transaction fees keep rising, the energy miners spend will continue to rise. BTC's price is going up due to speculation, and transaction fees due to blocks being full.
Even at current levels, BTC's built-in block rewards dominate tx fees (12.5 BTC built-in + ~1.38 BTC in tx fees per block). The 12.5 BTC built-in reward is essentially a subsidy for mining.
Ask HN: Which programming language has the best documentation?
Ask HN: Recommended course/website/book to learn data structure and algorithms
I am a full-time Android developer who does most of his programming work in Java. I am a non-CS graduate, so I didn't take a data structures and algorithms course in university; I'm not familiar with this subject, which is hindering my prospects of getting better programming jobs. There are so many resources out there on this subject that I am unable to decide which one is best for my case. Could someone please point me in the right direction? Thanks.
Data Structure: https://en.wikipedia.org/wiki/Data_structure
Algorithm: https://en.wikipedia.org/wiki/Algorithm
Big O notation: https://en.wikipedia.org/wiki/Big_O_notation
Big-O Cheatsheet: http://bigocheatsheet.com
Coding Interview University > Data Structures: https://github.com/jwasham/coding-interview-university/blob/...
OSSU: Open Source Society University > Core CS > Core Theory > "Algorithms: Design and Analysis, Part I" [&2] https://github.com/ossu/computer-science/blob/master/README....
"Algorithms, 4th Edition" (2011; Sedgewick, Wayne): https://algs4.cs.princeton.edu/
Complexity Zoo > Petting Zoo (P, NP,): https://complexityzoo.uwaterloo.ca/Petting_Zoo
While perusing awesome-awesomeness [1], I found awesome-algorithms [2] , algovis [3], and awesome-big-o [4].
[1] https://github.com/bayandin/awesome-awesomeness
[2] https://github.com/tayllan/awesome-algorithms
Why is quicksort better than other sorting algorithms in practice?
The top-voted response is helpful: throwing away constants as in Big-O notation can be misleading, and average-case behavior (not just the worst case) is what matters in practice; see Sedgewick's Algorithms book.
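For reference, a plain recursive quicksort sketch; production sorts add pivot selection, in-place partitioning, and small-array fallbacks:

```python
# Plain recursive quicksort (not in-place); average case O(n log n),
# worst case O(n^2) when pivots are consistently bad.
def quicksort(xs):
    if len(xs) <= 1:
        return xs
    pivot = xs[len(xs) // 2]
    less = [x for x in xs if x < pivot]
    equal = [x for x in xs if x == pivot]
    greater = [x for x in xs if x > pivot]
    return quicksort(less) + equal + quicksort(greater)

print(quicksort([3, 1, 4, 1, 5, 9, 2, 6]))  # [1, 1, 2, 3, 4, 5, 6, 9]
```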
ORDO: a modern alternative to X.509
There are a number of W3C specs for this type of thing.
Linked Data Signatures (ld-signatures) relies upon a graph canonicalization algorithm that works with any RDF format (RDF/XML, JSON-LD, Turtle,)
> The signature mechanism can be used across a variety of RDF data syntaxes such as JSON-LD, N-Quads, and TURTLE, without the need to regenerate the signature
https://w3c-dvcg.github.io/ld-signatures/
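A minimal sketch of the canonicalize-then-sign idea, using JSON key ordering as a stand-in for the RDF graph canonicalization (e.g. URDNA2015) that ld-signatures actually performs:

```python
# Canonicalize-then-sign sketch: hash a canonical serialization so two
# syntactic variants of the same data produce the same digest. (Real
# ld-signatures canonicalize the RDF graph, not JSON key order.)
import hashlib
import json

doc_a = {"name": "Alice", "id": "urn:example:1"}
doc_b = {"id": "urn:example:1", "name": "Alice"}  # same data, reordered

def canonical_digest(doc):
    canonical = json.dumps(doc, sort_keys=True, separators=(",", ":"))
    return hashlib.sha256(canonical.encode()).hexdigest()

print(canonical_digest(doc_a) == canonical_digest(doc_b))  # True
```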
A defined way to transform ORDO to RDF would be useful for WoT graph applications.
WebID can express X509 certs with the cert ontology. {cert:X509Certificate, cert:PGPCertificate,} rdfs:subClassOf cert:Certificate
https://www.w3.org/ns/auth/cert
https://www.w3.org/2005/Incubator/webid/spec/
ld-signatures is newer than WebID.
(Also, we should put certificates in a blockchain; just like Blockcerts (JSON-LD))
Wine 3.0 Released
Kimbal Musk is leading a $25M mission to fix food in US schools
Spinzero – A Minimal Jupyter Notebook Theme
+1. The Computer Modern serif fonts look legit. Like LaTeX legit.
Now, if we could make the fonts unscalable and put things in two columns (in order to require extra scrolling and 36 character wide almost-compiling copy-and-pasted code samples without syntax highlighting) we'd be almost there!
I searched for Computer Modern fonts and they're all available here: http://canopus.iacp.dvo.ru/~panov/cm-unicode/
I am surprised that these beauties are not widely adopted on websites and such. I agree, they just look very disciplined and professional.
I'd hope someday these relics are hosted on Google Fonts.
What does the publishing industry bring to the Web?
Q: What does the publishing industry bring to the Web?
A: PDF hosting, comments, a community of experts
FWIU, Publishing@W3C proposes WPUB [1] instead of PDF or MHTML for 'publishing' http://schema.org/ScholarlyArticle .
How do WPUB canonical identifiers (which reference/redirect(?) to the latest version of the resource) work with W3C Web Annotations attached to e.g. sentences within a resource identified with a URI? When the document changes, what happens to the attached comments? This is also a problem with PDFs: with a filename like document-20180111-v01.pdf and a stable(!) URL like http://example.org/document-20180111-v01.pdf, we can add Web Annotations to that URI; but with a new URI, those annotations are lost.
Git is a blockchain
Bitcoin is very much inspired by git; though in terms of immutability it's more similar to mercurial and subversion (git push -f)
Git accepts whatever timestamp a node chooses to add to a commit. This can cause interesting sorts in terms of chronological and topological sort orders.
Without an agreed-upon central git server there is not a canonical graph.
You can use GPG signatures with Git, but you need to provide your own keyserver and then there's still no way to enforce permissions (e.g. who can ALTER, UPDATE, or DELETE which files).
Git is a directed acyclic graph (DAG). Not a chain. Blockchains are chains to prevent double-spending (e.g. on a different fork).
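The tamper-evidence that Git and blockchains share can be sketched as a minimal hash chain (simplified: real Git hashes tree, parents, author, and message together):

```python
# Simplified hash chain: each entry id covers its parent's id, so
# editing any ancestor changes every descendant id (tamper-evident).
import hashlib

def entry_id(parent_id, payload):
    return hashlib.sha256((parent_id + payload).encode()).hexdigest()

a = entry_id("", "commit 1")
b = entry_id(a, "commit 2")

a2 = entry_id("", "commit 1 (edited)")   # rewrite history...
b2 = entry_id(a2, "commit 2")            # ...and the child id changes too
print(b != b2)  # True
```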
The bitcoin-dev mailing list is hosted by The Linux Foundation (Linus Torvalds wrote Git): https://lists.linuxfoundation.org/pipermail/bitcoin-dev/
It's always been a mix of inspiration in my mind:
- Torrent P2P file sharing.
- Git like data structure and protocol
- Immutability from functional programming
- Public key cryptography
Show HN: Convert Matlab/NumPy matrices to LaTeX tables
LaTeX must be escaped in order to prevent LaTeX injection.
AFAIU, numpy.savetxt does not escape LaTeX characters?
Jupyter Notebook rich object display protocol checks for obj._repr_latex_() when converting a Jupyter notebook from .ipynb to LaTeX.
The Pandas _repr_latex_() function calls to_latex(escape=True†). https://github.com/pandas-dev/pandas/blob/master/pandas/core...
† The default value of escape (and a few other presentational parameters) is determined from the display.latex.escape option: https://pandas.pydata.org/pandas-docs/stable/options.html?hi...
df = pd.read_csv('filename.csv'); df.to_latex(escape=True)
Or, with a Jupyter notebook:
df = pd.read_csv('filename.csv'); df
# $ jupyter nbconvert --to latex filename.ipynb
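A minimal sketch of what "escaping LaTeX" means here (not the pandas implementation; a complete escaper also handles backslash, caret, and tilde):

```python
# Map LaTeX-active characters to escaped forms so cell text
# can't inject markup into the generated table.
LATEX_SPECIALS = {
    "&": r"\&", "%": r"\%", "$": r"\$", "#": r"\#",
    "_": r"\_", "{": r"\{", "}": r"\}",
}

def escape_latex(text):
    return "".join(LATEX_SPECIALS.get(ch, ch) for ch in text)

print(escape_latex("50% of $100"))  # 50\% of \$100
```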
Wouldn't it be great if there were a LaTeX incantation for specifying that the referenced dataset URI (optionally also displayed as a table) is a premise of the analysis, with RDFa and/or JSON-LD in addition to the LaTeX PDF? That way, an automated analysis tool could identify and at least retrieve the data for rigorous, unbiased analyses.
http://schema.org/ScholarlyArticle
#StructuredPremises
A Year of Spaced Repetition Software in the Classroom
What a great article about using Anki during class in a Language Arts curriculum.
NIST Post-Quantum Cryptography Round 1 Submissions
Are there any blogs that talk about what’s required in a crypto algorithm to withstand quantum computing? Is a straightforward increasing the key size enough? Or are there new paradigms explored? Are these algorithms symmetric or asymmetric?
This paper lists a few of the practical concerns for quantum-resistant algos (and proposes an algo that wasn't submitted to NIST Post-Quantum Cryptography Round 1):
"Quantum attacks on Bitcoin, and how to protect against them" https://arxiv.org/abs/1710.10377 (~2027?)
A few Quantum Computing and Quantum Algorithm resources: https://news.ycombinator.com/item?id=16052193
Responsive HTML (arxiv-vanity/engrafo, PLoS,) or Markdown in a Jupyter notebook (stored in a Git repo with a tag and maybe a DOI from figshare or Zenodo) really would be far more useful than comparing LaTeX equations rendered into PDFs.
What are some good resources to learn about Quantum Computing?
Quantum computing: https://en.wikipedia.org/wiki/Quantum_computing
Quantum algorithm: https://en.wikipedia.org/wiki/Quantum_algorithm
Quantum Algorithm Zoo: http://math.nist.gov/quantum/zoo/
Jupyter notebooks:
* QISKit/qiskit-tutorial > "Exploring Quantum Information Concepts" https://nbviewer.jupyter.org/github/QISKit/qiskit-tutorial/b...
* jrjohansson/qutip-lectures > "Lecture 0 - Introduction to QuTiP - The Quantum Toolbox in Python" https://nbviewer.jupyter.org/github/jrjohansson/qutip-lectur...
http://qutip.org/tutorials.html
* sympy/quantum_notebooks https://nbviewer.jupyter.org/github/sympy/quantum_notebooks/...
https://github.com/topics/quantum
krishnakumarsekar/awesome-quantum-machine-learning: https://github.com/krishnakumarsekar/awesome-quantum-machine...
arxiv quant-ph: https://arxiv.org/list/quant-ph/recent
Gridcoin: Rewarding Scientific Distributed Computing
In all honesty this is what I've been waiting for in terms of a useful cryptocurrency. Now if we could only decentralize the control of what projects the processing goes towards with smart contracts then we could have a coin with more actual utility. Imagine the hash rate of the BTC network going towards some useful calculations.
> Imagine the hash rate of the BTC network going towards some useful calculations.
""" CureCoin Reaches #1 Ranking on Folding@home
As of the afternoon of August 29, 2017 (Eastern Time), the Curecoin Team 224497 earned the world's #1 rank on Stanford's Folding@home - a protein folding simulation Distributed Computing Network (DCN). In a little over 3 years, the team (including our merge-folding partners at Foldingcoin) collectively produced 160 billion points worth of molecular computations to support research in the areas of cancer, Alzheimer's, Huntington's, Parkinson's, Infectious Disease as well as helping scientists uncover new molecular dynamics through groundbreaking computational techniques. """
> Imagine the hash rate of the BTC network going towards some useful calculations.
Unfortunately, BTC mining now runs almost entirely on ASICs that can't be used to compute anything but SHA-256.
Imagine an equivalent amount of compute power in an alternative future. I mean, you're right, and this limitation is a flaw in Bitcoin, but you've also missed the point.
It's also hard to call it a flaw in Bitcoin, considering it was an intentional design decision.
Even intentional design decisions can be flawed.
There's a pretty hard limit bounding the optimizability of SHA256. That's why hashcash uses a cryptographic hash function.
There may be - or, very likely are - shortcuts for proof of research better than Grover's; which, when found, will also be very useful for science and medicine. However, that advantage is theoretically destabilizing for a distributed consensus network; which is also a strange conflict in incentives.
Sort of like buying "buy gold" commercials when the market was heading into the worst recession since the Great Depression.
SSL accelerators may benefit from the SHA256 ASIC optimizations incentivized by the bitcoin design.
"""The accelerator provides the RSA public-key algorithm, several widely used symmetric-key algorithms, cryptographic hash functions, and a cryptographically secure pseudo-random number generator"""
GPU prices are also lower now; probably due to demand pulling volume. The TPS (transactions per second) rate is doing much better these days.
How would you solve the local datetime ordering problem with Git and signatures?
Power Prices Go Negative in Germany
The article doesn’t actually answer the questions it purports to answer. Better source: https://www.cleanenergywire.org/factsheets/why-power-prices-....
Energy markets are artificial markets designed to create various price signals that result in certain incentives on both generation and demand, subject to numerous constraints. One constraint is that demand and supply must balance. The grid can’t store much energy. Oversupply can cause grid frequency to go above 50/60 Hz, threatening grid stability: https://www.e-education.psu.edu/ebf483/node/705. Power prices go negative when there is too much generation capacity online at a given instant, relative to demand. That creates incentives for generators that can shut down (like natural gas) to do so.
Negative power prices are not a good thing for consumers. A negative price in the wholesale electric markets does not mean the electricity is "less than free." Obviously, even wind power or solar always costs positive money to generate in real terms. Instead, it signals a mismatch between generation capacity, storage capacity, and demand. In a grid with adequate storage capacity, negative prices would be extremely rare.
I think you're missing the critical piece of your explanation of why negative prices aren't good for consumers:
The reason negative power prices look like they would be good for consumers is that it seems like they should lower their monthly bills. But in practice the bill stays roughly the same, even with significant periods of negative prices. Why? Negative prices don't mean the cost of producing the power is negative; the power company is actually losing money during those periods. It has to recoup those losses somehow, and it does so by charging more when prices are positive.
In order to take advantage of negative power prices, a power consumer would have to dynamically increase their power consumption in response to the negative prices. If they have any significant power storage capacity, maybe they could store power and sell it back to the grid when prices go positive, or maybe turn on their mining rig while prices are negative, if they go negative frequently.
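To make the load-shifting idea concrete, here is a toy demand-response calculation; the load size and prices are made-up numbers for illustration, not figures from the article:

```python
# Hypothetical numbers: shift a 10 kWh flexible load (e.g. an EV charge)
# from an hour priced at 30 ct/kWh into an hour priced at -5 ct/kWh.
flexible_kwh = 10
peak_price_ct = 30       # cents per kWh, assumed
negative_price_ct = -5   # cents per kWh, assumed

# The consumer is paid 5 ct/kWh instead of paying 30 ct/kWh.
saving_ct = flexible_kwh * (peak_price_ct - negative_price_ct)
print(saving_ct)  # 350 cents saved
```

The point is that the saving only exists for consumers who can move consumption into the negative-price window.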
I'll just add in some empirical results. German power has risen to be some of the most expensive in Europe since they started the Energiewende. They have very low wholesale prices (driving the coal producers out of business) combined with very high retail prices (might be top 3 in Europe?).
The high retail price appears to be driven by the mechanisms driving the Energiewende, i.e. the transition to wind and solar.
https://www.cleanenergywire.org/factsheets/what-german-house...
https://www.ovoenergy.com/guides/energy-guides/average-elect...
The NYT article reads as though Germany is benefiting from the switch to renewables, and that the only problems are that sometimes there's so much power around that they have to pay people to use it.
It mentions high cost of electricity, and that it's due to fees and "renewable investment costs" but then immediately hand-waves that away because "household energy bills have been rising over all anyway."
When combined with your information above (those hand-wavy fees actually account for fully 50% of the costs, with 24% being the renewables surcharge) it would seem that the NYT is being misleading. It seems they want to convey the idea that the renewables are an immediate good thing for everyone (which I do not take issue with politically) while downplaying the significant costs to consumers. Am I missing something?
If I'm not, then this does nothing but contribute to the current view of American media as being intentionally misleading when it suits their interests.
No, I don't think you're missing something here.
The price for electricity is indeed very high in Germany, rising to new heights with the new year, again.
The problem is, unsurprisingly, regulation and policy, i.e. the fees you already mentioned, and the subsidies for industrial usage, etc.
So, yes, it actually seems that the NYT is misleading here.
Let's leave this as an exercise for the reader, to judge if anyone should really be surprised here.
I always think of the old "Trust, but verify" quip, and that it is actually based on an old Russian proverb. The irony... :)
> The price for electricity is indeed very high in Germany
It's not only very high, but the second highest in the world, and we are about to take over the first spot [1]
> The problem is, unsurprisingly, regulation and policy, i.e. the fees you already mentioned, and the subsidies for industrial usage, etc.
The problem is the subsidies for all the green energy. It's not only the direct costs but also the costs of grid interventions (turning capacity on/off), paying for renewables even when they are not producing because there is too much energy available, the huge requirements for new north-to-south transmission lines, phasing out nuclear power, etc.
Subsidies for industrial usage are often cited by interested parties as increasing costs, but they are also a direct requirement of the Energiewende, because the economic damage would otherwise be gigantic. Unless you want the energy-intensive factories to leave your country, you have to factor in those costs.
> So, yes, it actually seems that the NYT is misleading here.
Yes, as does most media, especially in Germany. Those negative prices are no win for any German. Why else would we soon be paying the highest prices for electricity in the world?
[1] http://www.sueddeutsche.de/wirtschaft/energiekosten-in-oecd-...
"Several countries in Europe have experienced negative power prices, including Belgium, Britain, France, the Netherlands and Switzerland."
> Yes, as does most media, especially in Germany. Those negative prices are no win for any German. Why else would we soon be paying the highest prices for electricity in the world?
AFAIU, it's because you're aggressively shaping the energy market in order to reduce health and environmental costs now.
The technical issue here is that batteries are not good enough yet, and [hemp] supercapacitors are not yet produced at the volume needed to lower costs. So, maintaining a high price for energy keeps the market competitive for renewables, which have positive externalities.
Bitcoin is an energy arbitrage
In addition to relocating to where energy is the least expensive, Bitcoin creates incentive for miners to lower the local cost of energy: invest in renewable energy.
Renewable Energy / Clean Energy is now less expensive than alternatives; with continued demand, the margins are at least maintained.
> In addition to relocating to where energy is the least expensive, Bitcoin creates incentive for miners to lower the local cost of energy: invest in renewable energy.
We have lots of direct and effective subsidies for nonrenewable energy in the United States. And some for renewables, as well. For example [1], the average effective tax rate over all money-making companies: 26%
"Coal & Related Energy": 0.69%
"Oil/Gas (integrated)": 8.01%
"Power": 29.22%
"Green and Renewable Energy": 26.42%
[1] "Tax Rates by Sector (US)" (January 2017) http://pages.stern.nyu.edu/~adamodar/New_Home_Page/datafile/...
X-posting here from the article's comments:
The price reflects the confidence investors have in the security's ability to meet or exceed inflation and in the information security of the network.
Volatility adds value for algo traders: say the prices are [1, 101, 51, 101, 51, 201]:
(101-1)+(101-51)+(201-51)=300
(201-1)=200
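In code, the two figures above compare a hypothetical trader who captures every upswing against buy-and-hold over the same price series:

```python
prices = [1, 101, 51, 101, 51, 201]

# Sum of all upswings: what a trader capturing every rise would earn.
swing = sum(max(b - a, 0) for a, b in zip(prices, prices[1:]))

# Net change: what a buy-and-hold investor earns.
hold = prices[-1] - prices[0]

print(swing, hold)  # 300 200
```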
For the average Joe looking at the vested options they're hodling, though, volatility is unfriendly.
When e.g. algo-traders are willing to buy in when the price starts to fall, they're making liquidity; which some exchanges charge less for.
Enigma Catalyst (Zipline) is one way to backtest and live-trade cryptocurrencies algorithmically.
There are now more than 200k pending Bitcoin transactions
At 20 transactions per second it's a delay of 3 hours. (200000/20/60/60)
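The arithmetic, spelled out (20 TPS is the assumed throughput from the comment above):

```python
pending_txs = 200_000
tps = 20  # assumed sustained throughput

delay_hours = pending_txs / tps / 3600
print(round(delay_hours, 2))  # 2.78, i.e. roughly 3 hours
```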
"The bitcoin network's theoretical maximum capacity sits between 3.3 to 7 transactions per second."
The OT link does say "Transactions Per Second 22.54".
The solutions for this 3 hour backlog of unconfirmed transactions include: implementing SegWit, increasing the blocksize, and Lightning Network.
I think the link counts incoming transactions. I.e. you can always schedule more, but it doesn't mean they're going to be acted on in a reasonable timeframe.
What ORMs have taught me: just learn SQL (2014)
ORMs:
- Are maintainable by a team. "Oh, because that seemed faster at the time."
- Are unit tested: eventually we end up creating at least structs or objects anyway; then those need to be the same everywhere; then the abstraction is wrong because "everything should just be functional like SQL", until we need to figure out what you called "the_initializer2".
- Can make it very easy to create maintainable test fixtures which raise exceptions when the schema has changed but the test data hasn't.
- Prevent SQL injection errors by consistently parametrizing queries and quoting appropriately for the target SQL dialect (injection is one of the Top 25 most frequent vulnerabilities). This is especially important because most apps GRANT both UPDATE and DELETE (if not CREATE TABLE and DROP TABLE) to the sole app account.
- Make it much easier to port to a new database; or run tests with SQLite. With raw SQL, you need the table schema in your head and either comprehensive test coverage or to review every single query (and the whole function preceding db.execute(str, *params))
- May be the performance bottleneck for certain queries; which you can identify with code profiling and selectively rewrite by hand if adding an index and hinting a join or lazifying a relation aren't feasible with the non-SQLAlchemy ORM that you must use.
- Should provide a way to generate the query at dev or compile-time.
- Should make it easy to DESCRIBE the query plans that code profiling indicates are worth hand-optimizing (learning SQL is sometimes not the same as learning how a particular database plans a query over tables without indexes)
- Make managing db migrations pretty easy.
- SQLAlchemy really is great. SQLAlchemy has eager loading to solve the N+1 query problem. Django is often more than adequate; and has had prefetch_related() to solve the N+1 query problem since 1.4. Both have an easy way to execute raw queries (that all need to be reviewed for migrations). Both are much better at paging without allocating a ton of RAM for objects and object attributes that are irrelevant now.
- Make denormalizing things from a transactional database with referential integrity into JSON really easy; which webapps and APIs very often need to do.
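On the injection point in the list above, a minimal sketch using the stdlib sqlite3 module shows why parametrized queries (what an ORM emits) differ from string interpolation; the table and payload here are made up for illustration:

```python
import sqlite3

conn = sqlite3.connect(":memory:")
conn.execute("CREATE TABLE users (name TEXT)")
conn.execute("INSERT INTO users VALUES ('alice')")

evil = "x' OR '1'='1"

# Unsafe string interpolation: the payload escapes the quotes and the
# WHERE clause matches every row.
unsafe = conn.execute(
    "SELECT count(*) FROM users WHERE name = '%s'" % evil
).fetchone()[0]

# Parametrized query: the driver treats the input as data, not SQL.
safe = conn.execute(
    "SELECT count(*) FROM users WHERE name = ?", (evil,)
).fetchone()[0]

print(unsafe, safe)  # 1 0
```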
Is there a good JS ORM? Maybe in TypeScript?
I've used http://bookshelfjs.org/ but I'd stay away from it; it only felt cumbersome, with no productivity gain.
It's built on top of the query builder Knex (http://knexjs.org/), which is decent.
Objection.js and the Knex query builder are excellent. Think ORM light with full access to SQL.
Show HN: An educational blockchain implementation in Python
> It is NOT secure neither a real blockchain and you should NOT use this for anything else than educational purposes.
It would be nice if non-secure parts of implementation or design were clearly marked.
What's the point of an educational article if bad examples aren't clearly marked as bad? If MD5 usage is the only issue, the author could easily replace it with SHA and drop the warning at the start. If there are other issues, how can a reader know which parts to trust?
Even if fixing the bad/insecure parts is "left as an exercise for the reader", the learning value of the article would be much greater if those parts were at least pointed out.
OP here.
erikb is spot on in the sibling comment. This hasn't been expert-reviewed or audited, so I'm pretty confident there is a bug somewhere that I don't know about.
It's educational in the sense that I tried as best as I could to implement the various algorithmic parts (mining, validating blocks & transactions, etc...).
I originally used MD5 because I thought I would do more exploration regarding difficulty and MD5 is faster to compute than SHA. In the end, I didn't do that exploration, so I could easily replace MD5 with SHA. I'll update the notebook to use SHA, but I'm still not gonna remove the warning :)
I'll also try to point out more explicitly which parts I think are not secure.
> I'll also try to point out more explicitly which parts I think are not secure.
Things I've noticed:
* Use of floating point arithmetic.
* Non-reproducible serialization in verify_transaction can produce slightly different, but equivalent JSON, which leads to rejecting transactions if produced JSON is platform-dependent (e.g. CRLFs, spaces vs tabs).
* Miners can perform DoS by creating a pair of blocks referencing each other (recursive call in verify_block is made before any sanity checks or hash checks, so they can modify block's ancestor without worrying about changing its hash).
* mine method can loop forever due to integer overflow.
* Miners can put in block a transaction with output sum greater than input sum - only place where it is checked is in compute_fee and no path from verify_block leads there.
Those are all very good points I didn't think about, thanks for these.
I'll fix the two bugs with verify_block and the possibility for a miner to inject an invalid output > input transaction.
I'll add a note for the 3 others.
For deterministic serialization (~canonicalization), you can use sort_keys=True or serialize OrderedDicts. For deserialization, you'd need object_pairs_hook=collections.OrderedDict.
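For example, with a hypothetical transaction dict:

```python
import json

tx = {"to": "bob", "amount": 5, "from": "alice"}

# Fixed key order and separators mean the same logical payload always
# serializes to the same bytes, so its hash is reproducible.
canonical = json.dumps(tx, sort_keys=True, separators=(",", ":"))
print(canonical)  # {"amount":5,"from":"alice","to":"bob"}
```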
Most current blockchains sign a binary representation with fixed length fields. In terms of JSON, JSON-LD is for graphs and it can be canonicalized. Blockcerts and Chainpoint are JSON-LD specs:
> Blockcerts uses the Verifiable Claims MerkleProof2017 signature format, which is based on Chainpoint 2.0.
https://github.com/blockchain-certificates/cert-verifier-js/...
FYI, dicts are now ordered by default as of Python 3.6.
That's an implementation detail, and shouldn't be relied upon. If you want an ordered dictionary, you should use collections.OrderedDict.
It's now the spec for 3.6+.
> #python news: @gvanrossum just pronounced that dicts are now guaranteed to retain insertion order. This is the end of a long journey.
https://twitter.com/raymondh/status/941709626545864704
More here: https://www.reddit.com/r/Python/comments/7jyluw/dict_knownor...
OrderedDicts are backwards-compatible and are guaranteed to maintain order after deletion.
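A quick demonstration of the remaining difference (CPython 3.7+ semantics): plain dicts keep insertion order but still compare order-insensitively, while OrderedDict comparison is order-sensitive.

```python
from collections import OrderedDict

# Plain dicts preserve insertion order (guaranteed since Python 3.7)...
d1 = {"b": 1, "a": 2}
d2 = {"a": 2, "b": 1}
print(list(d1), d1 == d2)  # ['b', 'a'] True  <- equality ignores order

# ...whereas OrderedDict equality takes order into account.
od1 = OrderedDict([("b", 1), ("a", 2)])
od2 = OrderedDict([("a", 2), ("b", 1)])
print(od1 == od2)  # False
```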
Thanks! Simplest explanation I've seen.
Here's an nbviewer link (which, like base58, works on/over a phone): https://nbviewer.jupyter.org/github/julienr/ipynb_playground...
Note that Bitcoin does two rounds of SHA256 rather than one round of MD5. There's also a "P2P DHT" (peer-to-peer distributed hash table) for storing and retrieving blocks from the blockchain; instead of traditional database multi-master replication and secured offline backups.
> ERROR:root:Invalid transaction signature, trying to spend someone else's money ?
This could be more specific. Where would these types of error messages log to?
My mistake, it's BitTorrent that has a DHT. Instead of finding the most network local peer with the block identified by a (prev_hash, hash) hash table key, the Bitcoin blockchain broadcasts all messages to all nodes; which must each maintain a complete backup of the entire blockchain.
"Protocol documentation" https://en.bitcoin.it/wiki/Protocol_documentation
MSU Scholars Find $21T in Unauthorized Government Spending
Unauthorized federal spending (in these two departments) 1998-2015: $21T
Federal debt (2017): $20T
$ 20,000,000,000,000 USD
Would a blockchain for government expenditures help avoid this type of error?
We already now have https://usaspending.gov ( https://beta.usaspending.gov ) and expenditure line item metadata.
Would having traceable money in a distributed ledger help us keep track of money collected from taxpayers?
Obviously, the volatility of most cryptocurrencies would be disadvantageous for transferring and accounting for government spending. Isn't there a way to peg a cryptocurrency to the USD, even with Quantitative Easing? How is Quantitative Easing different from simply deciding to print trillions more 'coins' to counter debt, inflation, or deflation? And why is the government in debt at all?
re: Quantitative Easing
https://en.wikipedia.org/wiki/Quantitative_easing
Say I have $100 in my Social Security Fund (in very non-aggressive investments that need to meet or exceed inflation), and the total supply of money (including paper notes and the numbers in debit and credit columns of various public and private databases) is $1T, with $1T in debt. If $1T is printed to pay off that debt, is my $100 in retirement savings then worth $50? Or is it more complex than that?
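Under the naive dilution model implicit in the question (ignoring velocity, interest rates, and everything else real economies do):

```python
savings = 100
money_supply = 1_000_000_000_000   # $1T, from the hypothetical above
newly_printed = 1_000_000_000_000  # print $1T to cover the debt

share_before = savings / money_supply
share_after = savings / (money_supply + newly_printed)

# Your share of the total money supply, and thus your naive
# purchasing power, is halved.
print(share_after / share_before)  # 0.5
```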
[deleted]
Universities spend millions on accessing results of publicly funded research
Are there good open source solutions for journal publishing? (HTML abstract, PDFs, comments, ...)?
Yes- quite a few beyond what's already been listed, in fact:
http://www.theoj.org/ https://plos.github.io/ambraproject/index.html https://v4.pubpub.org/
Ambra is being discontinued! http://blogs.plos.org/plos/2017/12/ceo-letter-to-the-communi...
Edit: And theoj doesn't really appear to be maintained anymore either...
> Ambra is being discontinued!
The article mentions the discontinuation of Aperta but nothing about Ambra?
https://plos.github.io/ambraproject/Developer-Overview.html
I'm sorry, I just realised that as well - my mistake.
An Interactive Introduction to Quantum Computing
Part 2 mentions two quantum algorithms that could be used to break Bitcoin (and SSH and SSL/TLS; and most modern cryptographic security systems): Shor's algorithm for factorization and Grover's search algorithm.
Part 2: http://davidbkemp.github.io/QuantumComputingArticle/part2.ht...
Shor's algorithm: https://en.wikipedia.org/wiki/Shor%27s_algorithm
Grover's algorithm: https://en.wikipedia.org/wiki/Grover%27s_algorithm
I don't know what heading I'd suggest for something about how concentration of quantum capabilities will create dangerous asymmetry. (That is why we need post-quantum ("quantum resistant") hash, signature, and encryption algorithms in the near future.)
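For scale, Grover's algorithm "only" halves the effective bit strength of a hash, while Shor's breaks ECDSA outright. A quick sketch of the Grover arithmetic, assuming SHA256's 256-bit output:

```python
# Grover's search needs ~2**(n/2) oracle calls to find an n-bit
# preimage, versus ~2**n classically.
n = 256
grover_work = 2 ** (n // 2)

# Effective security drops from 256 bits to 128 bits.
print(grover_work.bit_length() - 1)  # 128
```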
Quantum attacks on Bitcoin, and how to protect against them (ECDSA, SHA256)
"Quantum attacks on Bitcoin, and how to protect against them (ECDSA, SHA256)" https://www.arxiv-vanity.com/papers/1710.10377/
> […] On the other hand, the elliptic curve signature scheme used by Bitcoin is much more at risk, and could be completely broken by a quantum computer as early as 2027, by the most optimistic estimates.
From https://csrc.nist.gov/Projects/Post-Quantum-Cryptography :
> NIST has initiated a process to solicit, evaluate, and standardize one or more quantum-resistant public-key cryptographic algorithms. Nominations for post-quantum candidate algorithms may now be submitted, up until the final deadline of November 30, 2017.
Project Euler
After hearing about it for years, I decided to start working through Project Euler about two weeks ago. It really is much more about math than programming, although it's a lot of fun to take on the problems with a language that has tail call optimization because so many of the problems involve recurrence relations.
I like that the problems are constructed in a way that usually punishes you for trying to use brute force. Sometimes there's a problem that doesn't have a more elegant solution, though, as if to remind us that brute force often works remarkably well.
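As an example of the recurrence-relation flavor, here is Project Euler Problem 2 (sum of the even Fibonacci terms below four million), unrolled into a loop since Python has no tail call optimization:

```python
def even_fib_sum(limit):
    # Walk the Fibonacci recurrence a, b = b, a + b, summing even terms.
    a, b, total = 1, 2, 0
    while b < limit:
        if b % 2 == 0:
            total += b
        a, b = b, a + b
    return total

print(even_fib_sum(4_000_000))  # 4613732
```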
I agree with Project Euler being mostly about math. I prefer Codewars for practicing programming or learning a new language.
Euler was a mathematician, after all. Now that I think about it, I wonder what a Project Dijkstra would look like, or say a Project Stallman.
There's a Project Rosalind (named after Rosalind Franklin) which is sort of like Project Euler for bioinformatics: http://rosalind.info/about/
I like https://rosalind.info bioinformatics problems because:
- There are problem explanations and an accompanying textbook.
- You can structure the solutions with unit tests that test for known good values.
- There's a graph of problems.
Who’s Afraid of Bitcoin? The Futures Traders Going Short
Wall Street being able to buy Bitcoin might increase demand, but it might also, at times, amplify sell pressure: panic selling, shorting, leveraged shorts, margin calls.
Statement on Cryptocurrencies and Initial Coin Offerings
A little emphasis, focusing on the aftermath from the SEC's July report on the DAO:
> "Following the issuance of the 21(a) Report, certain market professionals have attempted to highlight utility characteristics of their proposed initial coin offerings in an effort to claim that their proposed tokens or coins are not securities. Many of these assertions appear to elevate form over substance. Merely calling a token a “utility” token or structuring it to provide some utility does not prevent the token from being a security."
I think people are viewing this as an attack on crypto, when it's actually just common sense. People put too much faith in the 'Contract' half of 'Ethereum/Smart Contract'.
Basically. Today ICOs are selling tokens as shares of equity in their company, or similar. Which you can then sell on.
The problem is that these companies essentially reserve the right to disregard that contract: they could sell the company, domestically or overseas, for cash, without compensating any token holders.
Securities regulation and law stops that. But the tokens do need to be lawful securities in order for the court to recognize them.
Otherwise how can the court help you, and who else is going to help you?
> I think people are viewing this as an attack on crypto, when it's actually just common sense.
> […] The problem is that these companies essentially reserve the right to disregard that contract: they could sell the company, domestically or overseas, for cash, without compensating any token holders.
> Securities regulation and law stops that. But the tokens do need to be lawful securities in order for the court to recognize them.
This. The IRS regards coins and tokens as capital-gains-taxable assets regardless of whether they qualify as securities. The SEC exists to protect investors from scams and unfair dealing; to do that, it regulates the issuance of securities.
Ask HN: How do you stay focused while programming/working?
I often find myself "needing" to take a mini-break after just a few minutes of concerted effort while coding. In particular, this often occurs after I've made a tiny breakthrough, prompting me to reward myself by checking Twitter or HN. This bad habit quickly derails any momentum. What are some tips to increase focus stamina and avoid distraction?
It's not exactly new and exciting, but I found that listening to calm, instrumental music helps me focus. Mostly Ambient. If you do not like electronic music, Stars Of The Lid or Bohren & Der Club Of Gore are very much worth checking out.
Also, https://mynoise.net/ has worked wonders for me.
In both cases, it seems that unstructured audio input occupies the parts of my mind that would otherwise get distracted.
> It's not exactly new and exciting, but I found that listening to calm, instrumental music helps me focus. Mostly Ambient.
Same. Lounge, Ambient, Chillout, Chillstep (https://di.fm has a bunch of great streams. SoundCloud and MixCloud have complete replayable sets, too.)
I've heard that videogame soundtracks are designed to not be distracting; to help focus.
A Hacker Writes a Children's Book
The rhymes and illustrations look great! Is there a board book edition?
Other great STEM and computers books for kids:
"A is for Array"
"Lift-the-Flap Computers and Coding"
"Computational Fairy Tales"
"Hello Ruby: Adventures in Coding"
"Python for Kids: A Playful Introduction To Programming"
"Lauren Ipsum: A Story About Computer Science and Other Improbable Things"
"Rosie Revere, Engineer"
"Ada Byron Lovelace and the Thinking Machine"
"HTML for Babies: Volume 1 of Web Design for Babies"
"What Do You Do With a Problem?"
"What Do You Do With an Idea?"
"ABCs of Mathematics", "The Pythagorean Theorem for Babies", "Non-Euclidean Geometry for Babies", "Introductory Calculus for Infants", "ABCs of Physics", "Statistical Physics for Babies", "Newtonian Physics for Babies", "Optical Physics for Babies", "General Relativity for Babies", "Quantum Physics for Babies", "Quantum Information for Babies", "Quantum Entanglement for Babies"
"ELI5": "Explain like I'm five"
Someone should really make a list of these.
Ask HN: Do ISPs have a legal obligation to not sell minors' web history anymore?
I guess COPPA is still in place and in theory it applies to ISPs, although they may be allowed to assume that all traffic from a household comes from the bill payer who is presumably over 13.
http://kellywarnerlaw.com/childrens-online-privacy-protectio...
https://www.theverge.com/2017/3/31/15138526/isp-privacy-bill...
Tech luminaries call net neutrality vote an 'imminent threat'
> “The current technically-incorrect order discards decades of careful work by FCC chairs from both parties, who understood the threats that Internet access providers could pose to open markets on the Internet.”
Paid prioritization is that threat.
Again, streaming video content for all ages is not more important than online courses.
Ask HN: Can hashes be replaced with optimization problems in blockchain?
CureCoin.
From https://curecoin.net/knowledge-base/about-curecoin/what-is-c... :
> Curecoin allows owners of both ASIC and GPU/CPU hardware to earn. Curecoin puts ASICs to work at what they are good at–securing a blockchain, while it puts GPUs and CPUs to work with work items that can only be done on them–protein folding. While still having a secure blockchain, it supports, and thus is supported by, scientific research.
...
From "CureCoin Reaches #1 Ranking on Folding@home" https://www.newswire.com/news/bio-research-loves-curecoin-ga... :
> As of the afternoon of August 29, 2017 (Eastern Time), the Curecoin Team 224497 earned the world's #1 rank on Stanford's Folding@home - a protein folding simulation Distributed Computing Network (DCN). In a little over 3 years, the team (including our merge-folding partners at Foldingcoin) collectively produced 160 billion points worth of molecular computations to support research in the areas of cancer, Alzheimer's, Huntington's, Parkinson's, Infectious Disease as well as helping scientists uncover new molecular dynamics through groundbreaking computational techniques.
From https://news.ycombinator.com/item?id=15843795 :
> Gridcoin (Berkeley 2013) is built on Proof-of-Stake and Proof-of-Research. Gridcoin is used as payment for computing resources contributed to BOINC.
> I doubt that volatility would be welcome on the Gridcoin blockchain: Wikipedia lists "6.5% Inflation. 1.5% Interest + 5% Research Payments APR" under the Supply Growth infobox attribute.
Ask HN: What could we do with all the mining power of Bitcoin? Fold Protein?
Instead of buzzing SHA-512 in circles like busy bees ad infinitum, is there any way we can use these calculations productively?
Instead of algo-trading the stock markets?!
There are a number of distributed computing projects (e.g. SETI@home): https://en.wikipedia.org/wiki/List_of_distributed_computing_...
The Ethereum White Paper lists a number of applications for blockchains: https://github.com/ethereum/wiki/wiki/White-Paper
(Bitcoin is built on SHA-256; Ethereum is built on Keccak-256 (~SHA-3).)
Proof-of-Stake is a lower energy alternative to Proof-of-Work with tradeoffs: https://github.com/ethereum/wiki/wiki/Proof-of-Stake-FAQ
Unfortunately, I don't know of another way to reach secure consensus (blockchains are consensus protocols) in a DDoS-resistant way using unsolved problems as the work function.
> Unfortunately, I don't know of another way to reach secure consensus (blockchains are consensus protocols) in a DDoS-resistant way using unsolved problems as the work function.
Gridcoin (Berkeley 2013) is built on Proof-of-Stake and Proof-of-Research. Gridcoin is used as payment for computing resources contributed to BOINC.
I doubt that volatility would be welcome on the Gridcoin blockchain: Wikipedia lists "Supply growth 6.5% Inflation. 1.5% Interest + 5% Research Payments APR" under the Supply Growth infobox attribute.
No CEO needed: These blockchain platforms will let ‘the crowd’ run startups
How much energy does Bitcoin mining really use?
The Actual FCC Net Neutrality Repeal Document. TLDR: Read Pages 82-87 [pdf]
Here are some links to the relevant antitrust laws:
Sherman Antitrust Act (1890) https://en.wikipedia.org/wiki/Sherman_Antitrust_Act
Aspen Skiing Co. v. Aspen Highlands Skiing Corp. (1985) https://en.wikipedia.org/wiki/Aspen_Skiing_Co._v._Aspen_High....
Transparency in network management and paid prioritization practices and agreements will be relevant.
"We find that antitrust law, in combination with the transparency rule we adopt, is particularly well-suited to addressing any potential or actual anticompetitive harms that may arise from paid prioritization arrangements." (p.147)
If antitrust law is sufficient, as you've found, there would be no need for Title II Common Carrier regulation in any industry.
We can call phone numbers provided by any company at the same rate because phone companies are regulated as Title II Common Carriers. ISPs are also common carriers.
"Public airlines, railroads, bus lines, taxicab companies, phone companies, internet service providers,[3] cruise ships, motor carriers (i.e., canal operating companies, trucking companies), and other freight companies generally operate as common carriers."
The 5 most ridiculous things the FCC says in its new net neutrality propaganda
> The Federal Communications Commission put out a final proposal last week to end net neutrality. The proposal opens the door for internet service providers to create fast and slow lanes, to block websites, and to prioritize their own content. This isn’t speculation. It’s all there in the text.
Great. Payola. Thanks Verizon!
Does the FTC have the agreement information needed to hear the antitrust cases that are sure to result, now that what were complaints to the FCC (an organization with network-management expertise) are being redirected to the FTC?
Title II is the appropriate policy set for ISPs; regardless of how lucrative horizontal integration with content producers seems.
FCC's Pai, addressing net neutrality rules, calls Twitter biased
No. Censoring hate speech by banning people who are verbally assaulting others (in violation of Terms of Service that they agreed to) is a very different concern than requiring common carriers to equally prioritize bits.
If we extend "you must allow people to verbally assault others (because free speech applies to the government)" to TV and radio, what do we end up with?
Note that the FCC fines non-cable TV (broadcast radio and TV) for cursing on air. See "Obscene, Indecent and Profane Broadcasts" https://www.fcc.gov/consumers/guides/obscene-indecent-and-pr...
How can you ask social media companies to do something about fake news (the vast majority of which served to elect the current administration, which nominated this FCC chairman) while also lambasting them for upholding their commitment to providing a hate-free experience for net citizens and paying advertisers?
"Open Internet": No blocking. No throttling. No paid prioritization.
It would be easier for us to understand the "Open Internet" rules if the proposed "Restoring Internet Freedom" page wasn't crudely pasted over (redirected to from) the page describing the current Open Internet rules. www.fcc.gov/general/open-internet (current policy) now redirects to www.fcc.gov/restoring-internet-freedom (proposed policy).
ISPs blocking, throttling, or paid-prioritizing Twitter, Netflix, Fox, or CNN for everyone is a different concern than responding to individuals who are threatening others with hate speech.
The current policy ("Open Internet") means that you can use the bandwidth cap that you pay for for whatever legal content you please.
The proposed policy ("Restoring Internet Freedom") means that internet businesses will need to pay every ISP in order to not be slower than the big guys who can afford to pay-to-play (~"payola"). https://en.wikipedia.org/wiki/Payola
A curated list of Chaos Engineering resources
Never having heard of 'Chaos Engineering' before, this strikes me as a bad case of 'Cargo Cult Engineering'.
That starts with the term 'chaos', which has a well-defined meaning in Chaos Theory, where it is quite obviously borrowed from: small changes in input lead to large changes in output. Neither distributed systems in general, and especially not the sort of system this engineering strives to build, fit that definition. In fact, they are the exact opposite: every part of a typical web stack is already build to mitigate changing demands such as traffic peaks or attacks.
The mumbo jumbo around "defining a steady state" and "disproving the null hypothesis" seems like a veneer of sciency on a rather well-known concept: testing.
A supreme court justice once said: "Good writing is a $10 thought in a 5 cent sentence". This is the opposite.
"Resilience Engineering" would be a good alternative term for these failure scenario simulations and analyses.
Glossary of Systems Theory > A > Adaptive capacity:
> Adaptive capacity: An important part of the resilience of systems in the face of a perturbation, helping to minimise loss of function in individual human, and collective social and biological systems
Technology behind Bitcoin could aid science, report says
Bloom is working on non-academic credit building and scoring.
Hyperledger brings together many great projects and tools which have numerous applications in science and industry.
Is a blockchain necessary? Could we instead just sign JSONLD records with ld-signatures and store them in an eventually or strongly consistent database we all contribute resources to synchronizing and securing?
That's just the centralization or decentralization question.
We can do it all centralized already but we would also all need to trust whoever is hosting this data and trust every single person who has the ability to enter the data.
The fewer nodes you need to trust, the better; in a centralized solution where everyone can contribute, you need to be able to trust everyone.
In a decentralized system where everyone can contribute, you don't need to trust anyone, but you give up the benefits of centralization such as speed, performance, and usability.
> We can do it all centralized already but we would also all need to trust whoever is hosting this data and trust every single person who has the ability to enter the data.
There are plenty of ways to minimize trust required with traditional cryptography though, this is not all or nothing, we have been doing this since PGP. You can get the overwhelming majority of the benefits with none of the drawbacks.
But how else are you going to hype an ICO with claims about the size of a market and get people who don't understand how blockchains work to give you their BTC/ETH?
Git hash function transition plan
> Some hashes under consideration are SHA-256, SHA-512/256, SHA-256x16, K12, and BLAKE2bp-256.
Not sure what K12 is (Keccak?), but BLAKE2 is a very attractive option.
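Two of the candidate 256-bit digests can be compared directly with Python's `hashlib`; the payload bytes below are made up, and K12 (KangarooTwelve, which is Keccak-based) is not in the stdlib:

```python
import hashlib

# Hypothetical Git object payload; the bytes here are made up.
data = b"tree 9c0e3f...\nparent 1a2b3c...\n"

# Two of the candidate 256-bit digests are in the standard library:
sha256 = hashlib.sha256(data).hexdigest()
blake2b_256 = hashlib.blake2b(data, digest_size=32).hexdigest()  # BLAKE2b-256

# Both are 256 bits = 64 hex characters.
print(sha256)
print(blake2b_256)
```

(SHA-512/256 may also be available via `hashlib.new("sha512_256")` on builds whose OpenSSL supports it.)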
Vintage Cray Supercomputer Rolls Up to Auction
Google is officially 100% sun and wind powered – 3.0 gigawatts worth
+1000.
TIL this is called "Corporate Renewable Energy Procurement". https://www.google.com/search?q=Corporate+Renewable+Energy+P...
PPA: Power Purchase Agreement https://en.wikipedia.org/wiki/Power_purchase_agreement
Interactive workflows for C++ with Jupyter
QuantStack/xeus-cling https://github.com/QuantStack/xeus-cling
QuantStack/xwidgets https://github.com/QuantStack/xwidgets
QuantStack/xplot (bqplot) https://github.com/QuantStack/xplot
Vanguard Founder Jack Bogle Says ‘Avoid Bitcoin Like the Plague’
Over the past 7 years, Bitcoin has outperformed every security and portfolio that Jack Bogle has recommended.
This is pretty disrespectful to Jack Bogle.
Vanguard is almost singlehandedly responsible for returning trillions of dollars of costs, in the form of fees and underperformance by active managers, back to investors. Millions of investors have benefited.
Bitcoin has been a bubble since $1 and $100 to these people.
What evidence is there that it isn't a bubble? People buy Bitcoin only because they think they can sell it higher. Eventually, you will run out of greater fools.
Nasdaq Plans to Introduce Bitcoin Futures
My guess is that this is probably pretty meaningless.
There are a few things going against them.
- The CBOE and CME are both much larger futures exchanges and are going to be offering futures first
- Since you can't net out futures contracts from different exchanges, these markets tend to become winner-take-all
> One way Nasdaq seeks to differentiate itself seems to be in the amount of data it uses for pricing the digital currency contracts. VanEck Associates Corp., which recently withdrew plans for a bitcoin exchange-traded fund, will supply the data used to price the contracts, pulling figures from more than 50 sources, according to the person.
This might be interesting, as one of the things everyone is worried about is price manipulation.
If you haven't thought about how futures work with respect to margin and marking at the end of the trading day, you should know that you can be required to deposit more money into your margin account if the futures trade moves against you on any given day.
This means the marking price is very important, and lots of institutional money is worried that the exchanges are easy to manipulate.
see: http://openmarkets.cmegroup.com/3785/understanding-margin-ch...
> Nasdaq’s product will reinvest proceeds from the spin-off back into the original bitcoin in a way meant to make the process more seamless for traders, the person said.
This is awesome; right now the CBOE and CME have both punted on the question of forks, saying they'll make a best effort to figure it out.
> One way Nasdaq seeks to differentiate itself seems to be in the amount of data it uses for pricing the digital currency contracts. VanEck Associates Corp., which recently withdrew plans for a bitcoin exchange-traded fund, will supply the data used to price the contracts, pulling figures from more than 50 sources, according to the person. That appears to exceed CME’s plan to use four sources, and Cboe’s one. Nasdaq’s contracts will be cleared by Options Clearing Corp., the person said.
BitMEX bitcoin futures are already online. I don't know how many price sources they pull from.
Aren't there a few other companies already selling Bitcoin futures?
In general, when the CME enters a market for futures, they take all of the air out of the room. I don't think it's realistic to believe NASDAQ can compete with them.
https://twitter.com/officialmcafee/status/935900326007328768
Well John McAfee thinks bitcoin will hit 1 million by 2020.
Pump and dumpers gonna pump and dump. https://medium.com/@DEFCON_2015/why-mgt-was-delisted-from-ny...
Or, large investment banking houses will step in and create naked shorting opportunities to inflate sell pressure, creating 'death spirals' to drive prices down and scoop them up at extreme discounts. This happens in the traditional public markets every day.
> Or, large investment banking houses will step in and create naked shorting opportunities to inflate sell pressure, creating 'death spirals' to drive prices down and scoop them up at extreme discounts. This happens in the traditional public markets every day.
Is there a term for this?
Yes, this can happen in a few different ways, and it is the reason why Y Combinator created SAFEs. When you have a public company, you will get offers for what are called "credit lines", "debt financing" or "convertible notes". They are traditionally used to create death spirals https://www.investopedia.com/ask/answers/06/deathspiralbond.... as the size of your float increases when you, as the executive director (CEO/CFO) of a public company, "issue" more stock to cover the loan. The more you issue, the less you're worth, until somebody comes along, scoops you up, and re-engineers the cap table, which is a restructuring. However, manipulation can occur within institutions as well: https://news.ycombinator.com/threads?id=KasianFranks&next=14...
Ask HN: Where do you think Bitcoin will be by 2020?
I have a friend who believes it will be $100,000 per Bitcoin, and his reasoning is 'supply and demand'.
There will be around 18M bitcoins in 2020. [1][2]
[1] https://en.bitcoin.it/wiki/Controlled_supply
This paper [3] suggests we'll be needing to upgrade to quantum-secure hash functions instead of ECDSA before 2027.
[3] "Quantum attacks on Bitcoin, and how to protect against them" https://arxiv.org/abs/1710.10377
Hopefully, Ethereum will have figured out a Proof of Stake [4] solution for distributed consensus which is as resistant to DDoS as Proof of Work, but with less energy consumption (thereby, unfortunately or fortunately, removing the incentive to pursue clean energy as a primary business goal).
[4] https://github.com/ethereum/wiki/wiki/Proof-of-Stake-FAQ
Ask HN: Why would anyone share trading algorithms and compare by performance?
I was speaking with a person years my senior a while back, sharing information about the Quantopian platform (which allows users to backtest and share trading algorithms); and he asked me, "why would anyone share their trading algorithms [if they're making any money]?"
I tried "to help each other improve their performance". Is there a better way to explain why people would help each other improve their trading algorithms, to someone who spends their time reading forums with no objective performance comparisons over historical data?
Catalyst, like Quantopian, is also built on top of Zipline; but for cryptocurrencies. https://enigmampc.github.io/catalyst/example-algos.html
Zipline (backtesting and live trading of algorithms with initialize(context) and handle_data(context, data) functions; with the SPY S&P 500 ETF as a benchmark) https://github.com/quantopian/zipline
Pyfolio (for objectively comparing the performance of trading strategies over time) https://github.com/quantopian/pyfolio
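The `initialize(context)`/`handle_data(context, data)` shape can be sketched in plain Python. This is only an illustration of the pattern with a made-up moving-average crossover rule, not the actual Zipline API:

```python
# Minimal sketch of the initialize(context)/handle_data(context, data)
# pattern that Zipline uses. The prices and the crossover rule are
# illustrative; this is not the real Zipline API.

class Context:
    """Holds strategy state between bars, like Zipline's `context`."""
    pass

def initialize(context):
    context.window = 3     # moving-average lookback
    context.position = 0   # shares held
    context.cash = 1000.0

def handle_data(context, prices):
    # prices: list of closes up to and including the current bar
    if len(prices) < context.window:
        return
    ma = sum(prices[-context.window:]) / context.window
    price = prices[-1]
    if price > ma and context.position == 0:    # buy signal
        context.position = 1
        context.cash -= price
    elif price < ma and context.position == 1:  # sell signal
        context.position = 0
        context.cash += price

context = Context()
initialize(context)
closes = [10, 11, 12, 13, 12, 11, 10]
for i in range(1, len(closes) + 1):
    handle_data(context, closes[:i])
print(context.cash, context.position)
```

On this toy series the strategy buys at 12 and sells at 12, ending flat with its starting cash; a backtester like Zipline wraps this loop with data feeds, slippage, and benchmark comparison.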
...
"Community Algorithms Migrated to Quantopian 2" https://www.quantopian.com/posts/community-algorithms-migrat...
- "Reply to minimum variance w/ contrast" seems to far outperform the S&P 500.
Ask HN: CS papers for software architecture and design?
Can you please point me to some papers that you consider very influential for your work, or that you believe played a significant role in how we structure our software nowadays?
"The Architecture of Open Source Applications" Volumes I & II http://aosabook.org/en/
"Manifesto for Agile Software Development" https://en.wikipedia.org/wiki/Agile_software_development#The...
"Catalog of Patterns of Enterprise Application Architecture" https://martinfowler.com/eaaCatalog/
Fowler > Publications ("Refactoring ",) https://en.wikipedia.org/wiki/Martin_Fowler#Publications
"Design Patterns: Elements of Reusable Object-Oriented Software" (GoF book) https://en.wikipedia.org/wiki/Design_Patterns
UNIX Philosophy https://en.wikipedia.org/wiki/Unix_philosophy
Plan 9 https://en.wikipedia.org/wiki/Plan_9_from_Bell_Labs
## Distributed Systems
CORBA > Problems and Criticism (monolithic standards, oversimplification,): https://en.wikipedia.org/wiki/Common_Object_Request_Broker_A...
Bulk Synchronous Parallel: https://en.wikipedia.org/wiki/Bulk_synchronous_parallel
Paxos: https://en.wikipedia.org/wiki/Paxos_(computer_science)
Raft: https://en.wikipedia.org/wiki/Raft_(computer_science)#Safety
CAP theorem: https://en.wikipedia.org/wiki/CAP_theorem
Keeping a Lab Notebook [pdf]
I'd love to hear some thoughts about keeping a "lab notebook" for ML experiments. I use Jupyter Notebooks when playing around with different ML models, and I find that it really helps to document my thought process with notes and comments. It also seems that the ML workflow is very 'experiment' driven. I'm always thinking "Hm, I think if I tweak this hyperparameter this way, or adjust this layer this way, then I'll get a better result because X". Thus, I have a bit of a hypothesis and proposed experiment. I run that model, and see if it improved or not.
Then, I run into an issue where I can either: 1. overwrite the original model with my new hyperparameters/design and re-run and analyze or 2. keep adding to the same notebook "page" with a new hypothesis/test/analysis loop, thus making the notebook pretty large. With number 1, I often want to backtrack and re-reference how a previous experiment went, but I lose that history. With number 2, it seems to get big pretty quickly, and coming back to the same notebook requires more setup, and "searching" the history gets more cumbersome.
Does anyone try using a separate notebook page for each experiment, maybe with a timestamp or "version"? Or is there a better way to do this in a single notebook? I am thinking that something like "chapters" could help me here, and it seems like this extension might help me: https://github.com/minrk/ipython_extensions#table-of-content...
These are ASCII-sortable:
0001_Introduction.ipynb
0010_Chapter-1.ipynb
ISO8601 w/ UTC is also ASCII sortable.
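A sketch of ISO 8601-style UTC prefixes for per-experiment notebook names; the experiment titles are hypothetical:

```python
from datetime import datetime, timezone

def notebook_name(title, when=None):
    """ISO 8601-style UTC prefix: ASCII sort order == chronological order.
    Colons are omitted so the names are safe on common filesystems."""
    when = when or datetime.now(timezone.utc)
    return when.strftime("%Y-%m-%dT%H%M%SZ") + "_" + title + ".ipynb"

names = [
    notebook_name("lr-sweep", datetime(2017, 11, 30, 9, 0, tzinfo=timezone.utc)),
    notebook_name("baseline", datetime(2017, 11, 29, 17, 30, tzinfo=timezone.utc)),
]
print(sorted(names))  # 'baseline' (Nov 29) sorts before 'lr-sweep' (Nov 30)
```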
# Jupyter notebooks as lab notebooks
## Disadvantages
### Mutability
With a lab notebook, you can cross things out but they're still there.
- [ ] ENH: Copy cell and mark as don't execute (or wrap with ```language\n``` and change the cell type to markdown)
- [ ] ENH: add a 'Save and {git,} Commit' shortcut
CoCalc (was: SageMathCloud) has (somewhat?) complete notebook replay with a time slider; and multi-user collaborative editing. ("Time-travel is a detailed history of all your edits and everything is backed up in consistent snapshots.")
### Timestamps
You must add timestamps by hand; i.e. as #comments or markdown cells.
- [ ] ENH: add a markdown cell with a timestamp (from a configurable template) (with a keyboard shortcut)
### Project files
You must manage the non-.ipynb sources separately. (You can create a new file or folder. You can just drag and drop to upload. You can open a shell tab to run `git status`, `git diff`, `git commit`, and `git push`, if the Jupyter/JupyterHub/CoCalc instance has network access to e.g. GitLab or GitHub.)
## Advantages
### Reproducibility
Executable I/O cells
The version_information and/or watermark extensions will inline the software versions that were installed when the notebook was last run
Dockerfile for OS config
Conda environment.yml (and/or pip requirements.txt and/or pipenv Pipfile) for further software dependencies
BinderHub can rebuild a Docker image on receipt of a webhook from a git repo, push the built image to a Docker image repository, and then host prepared Jupyter instances (with Kubernetes) which contain (and reproducibly archive) all of the preinstalled prerequisites.
Diff: `git diff`, `nbdime`
### Publishing
You can generate static HTML, HTML slides with RevealJS, interactive HTML slides with RISE, executable source with comments (e.g. a .py file), LaTeX, and PDF with 'Save as' or `jupyter nbconvert --to`. You can also create slides with nbpresent.
MyBinder.org and Azure Notebooks have badges for e.g. a README.md or README.rst which launch a project executably in a docker instance hosted in a cloud. CoCalc and Anaconda Cloud also provide hosted Jupyter Notebook projects.
You can template a gradable notebook with nbgrader.
GitHub and nbviewer both render .ipynb notebooks as HTML.
There are more than 90 Jupyter Kernels for languages other than Python.
https://github.com/quobit/awesome-python-in-education#jupyte...
How to teach technical concepts with cartoons
There's not a Wikipedia page for "visual metaphor", but there are pages for "visual rhetoric" https://en.wikipedia.org/wiki/Visual_rhetoric and "visual thinking" https://en.wikipedia.org/wiki/Visual_thinking
Negative space can be both meaningful and useful later on.
I learned about visual thinking and visual metaphor in application to business communications from "The Back of the Napkin: Solving Problems and Selling Ideas with Pictures" http://www.danroam.com/the-back-of-the-napkin/
Fact Checks
Indeed, fact checking systems are only as good as the link between identity credentialing services and a person.
http://schema.org/ClaimReview (as mentioned in this article) is a good start.
A few other approaches to be aware of:
"Reality Check is a crowd-sourced on-chain smart contract oracle system" [built on the Ethereum smart contracts and blockchain]. https://realitykeys.github.io/realitycheck/docs/html/
And standards-based approaches are not far behind:
W3C Credentials Community Group https://w3c-ccg.github.io/
W3C Verifiable Claims Working Group https://www.w3.org/2017/vc/WG/
W3C Verifiable News https://github.com/w3c-ccg/verifiable-news
In terms of verifying (or validating) subjective opinions, correlational observations, and inferences of causal relations, #LinkedMetaAnalyses of documents (notebooks) containing structured links to their data as premises would be ideal. Unfortunately, PDF is not very helpful in accomplishing that objective (in addition to being a terrible format for review with screen readers and mobile devices). I think HTML with RDFa (and/or CSVW JSONLD) is our best hope of making at least partially automated verification of meta-analyses a reality.
DHS orders agencies to adopt DMARC email security
From https://www.cyberscoop.com/dhs-dmarc-mandate/ :
> By Jan. 2018, all federal agencies will be required to implement DMARC across all government email domains.
> Additionally, by Feb. 2018, those same agencies will have to employ Hypertext Transfer Protocol Secure (HTTPS) for all .gov websites, which ensures enhanced website certifications.
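For reference, a DMARC policy is published as a DNS TXT record at `_dmarc.<domain>`; the domain and reporting mailbox below are placeholders, not values from the directive:

```
_dmarc.example.gov.  IN  TXT  "v=DMARC1; p=reject; rua=mailto:dmarc-reports@example.gov; pct=100"
```

`p=` sets the policy (none/quarantine/reject) and `rua=` is where aggregate reports are sent.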
The electricity for 1BTC trade could power a house for a month
The article seems to imply that a 1BTC transaction requires 200kWh of energy.
First, what is the source for that number?
Second, what is the business interest of the quoted individual? Are they promoting competing services?
Third, how much energy does the supposed alternative really take, by comparison?
How much energy do these aspects of said business operations require:
- Travel to and from the office for n employees
- Dry cleaning for n employees' work clothes
- Lights for an office of how many square feet
- Fraud investigations: hours worked, postal costs, wait times, CPU time and bandwidth spent trying to fix data silos' ledgers' transaction IDs and time skew; with a full table JOIN on data that nobody can have for more than a little while, from over here and over there
- Desktop machines' idle hours
- Server machines' idle hours
With low cost clean energy, these businesses are profitable; with a very different cost structure than traditional banking and trading.
PAC Fundraising with Ethereum Contracts?
I'll cc this here with formatting changes (extra \n and ---) for Hacker News:
---
### Background
- PAC: Political Action Committee https://en.wikipedia.org/wiki/Political_action_committee
- https://github.com/holographicio/awesome-token-sale
### Questions
- Is Civic PAC fundraising similar to e.g. a Crowdsale or a CappedCrowdsale or something else entirely, in terms of ERC20 OpenZeppelin solidity contracts?
- Would it be worth maintaining an additional contract for [PAC] "fundraising" with terminology that campaigns can understand; or a terminology map?
- Compared to just accepting donations at a wallet address, or just accepting credit/debit card donations, what are the risks of a token sale for a PAC?
--- Is there any way to check for donors' citizenship? (When/Where is it necessary to check donors' citizenship (with credit/debit cards or cryptocoins/cryptotokens?))
- Compared to just accepting donations at a wallet address, or just accepting credit/debit card donations, what are the costs of a token sale for a PAC?
--- How much gas would such a contract require?
- Compared to just accepting donations at a wallet address, or just accepting credit/debit card donations, what are the benefits of a token sale for a PAC?
--- Lower transaction fees than credit/debit cards?
--- Time limit (practicality, marketing)
--- Cap ("we only need this much")
--- Refunds in the event of […]
### Objectives
- Comply with all local campaign finance laws
--- Collect citizenship information for a Person
--- Collect citizenship information for an Organization 'person'
- Ensure that donations hold value
- Raise funds
- Raise funds up to a cap
- (Optionally?) collect names and contact information ( https://schema.org/Person https://schema.org/Organization )
- Optionally refund if the cap is not met
- Optionally change the cap midstream
- Optionally cancel for a specified string and/or URL reason
Here’s what you can do to protect yourself from the KRACK WiFi vulnerability
> But first, let’s clarify what an attacker can and cannot do using the KRACK vulnerability. The attacker can intercept some of the traffic between your device and your router. Attackers can’t obtain your Wi-Fi password using this vulnerability. They can just look at your traffic. It’s like sharing the same WiFi network in a coffee shop or airport.
From reading the articles:
( https://github.com/vanhoefm/krackattacks ; which is watch-able )
> Against these encryption protocols, nonce reuse enables an adversary to not only decrypt, but also to forge and inject packets.
https://www.kb.cert.org/vuls/id/228519
> Key reuse facilitates arbitrary packet decryption and injection, TCP connection hijacking, HTTP content injection, or the replay of unicast, broadcast, and multicast frames.
The Solar Garage Door – A Possible Alternative to the Emergency Generator
Is it possible to apply the https://solarwindow.com ($WNDW) glass coating to a non-glass garage door?
Using the Web Audio API to Make a Modem
Ask HN: How to introduce someone to programming concepts during 12-hour drive?
I won't go into details to keep this brief, but I'm going to spend a week with this client of mine's kid, and I'm supposed to teach him enough about programming for him to figure out if it's something he might be interested in pursuing.
He's about 20, and still struggling to finish high school, but he's smart (although perhaps a little weird).
I thought about introducing him to touch typing just to get a useful skill out of this regardless of the outcome. Then, I thought that during this week I'd teach him HTML and enough CSS to see what it's used for. I'm thinking that if he gets excited about typing code and seeing things happen, he'll want to study more and learn more advanced stuff in the future, and perhaps even make it his profession (this is what my client hopes will happen).
Now, part of this trip is a 12-hour drive. I thought I could use this time to introduce him to simple programming concepts. For instance, if asked to list all steps involved in starting a car, most people would say:
- turn key
- start car
That could turn into an infinite loop, though. A better way would be:
- turn key
- start car
- if it starts, exit
- if it doesn't start, repeat 3 more times
- if it still won't start, call a mechanic
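The corrected steps map directly onto a bounded retry loop; `try_start` below is a made-up stand-in for turning the key:

```python
def try_start(attempt):
    """Stand-in for turning the key; succeeds on the 3rd try here."""
    return attempt == 3

def start_car(max_tries=4):
    # Bounded loop: try at most max_tries times, then give up.
    for attempt in range(1, max_tries + 1):
        if try_start(attempt):
            return f"started on attempt {attempt}"
    return "call a mechanic"

print(start_car())
```

The unbounded "turn key until it starts" version is the same loop with no `max_tries`, which is exactly the infinite-loop risk described above.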
Stuff like this: things that anyone can understand, that can be explained without looking at a computer, but that are still useful.
Any idea what I could talk about? Examples, anecdotes, anything.
Computational Thinking:
https://en.wikipedia.org/wiki/Computational_thinking
> 1. Problem formulation (abstraction);
> 2. Solution expression (automation);
> 3. Solution execution and evaluation (analyses).
This is a good skills matrix to start with:
http://sijinjoseph.com/programmer-competency-matrix/
https://competency-checklist.appspot.com
"Think Python: How to Think Like a Computer Scientist"
http://www.greenteapress.com/thinkpython/html/index.html
K12CS Framework is good for all ages:
For syntax, Learn X in Y Minutes:
https://learnxinyminutes.com/docs/python3/
https://learnxinyminutes.com/docs/javascript/
Good one, but that's kind of hard to do while driving.
To get a job, "Coding Interview University":
https://github.com/jwasham/coding-interview-university
This is actually pretty awesome :-)
Start simple, start small and start with something he's interested in.
There's the part about helping him discover whether he likes to create things through computers and whether he actually believes he can create things through computers. You're spot on that he might get excited about typing code, but you'll have to figure out whether he's a visual person, a logical person, etc. For example, I got started learning to code once I understood what code can do to help automate things. A friend of mine got interested after seeing what websites he could build. Everyone is unique, so you'll have to learn about him as you're trying to teach.
Good advice. Do you have any tips on how to figure out if he's a visual or logical person, etc.?
You can learn about a person's internal representation by asking Clean Questions and listening to the metaphors that they share; in order to avoid transferring and inferring your own biased internal representation (MAPS: metaphors, assumptions, paradigms or sensations).
It's worth reading this whole article (and e.g. "Clean Language: Revealing Metaphors and Opening Minds")
https://en.wikipedia.org/wiki/Clean_Language
"Metaphors We Live By" explains conceptual metaphor ("internal representation" w/ Clean Language / Symbolic Modeling) and lists quite a few examples: https://en.wikipedia.org/wiki/Conceptual_metaphor
Our human brains tend to infer Given, When, Then "rules" which we only later reason about in terms of causal relations: https://en.wikipedia.org/wiki/Given-When-Then
It's generally accepted that software is more correct when we start with tests:
Given : When : Then :: Precondition : Command : Postcondition https://wrdrd.github.io/docs/consulting/software-development...
... "Criteria for Success and Test-Driven-Development" https://westurner.github.io/2016/10/18/criteria-for-success-...
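The Given : When : Then :: Precondition : Command : Postcondition mapping reads like this as a plain Python test; the discount function is invented for illustration:

```python
def apply_discount(total, percent):
    """The 'command' under test: reduce total by percent."""
    return round(total * (1 - percent / 100), 2)

def test_discount():
    # Given: a cart totaling $50.00 (precondition)
    total = 50.00
    # When: a 10% discount is applied (command)
    discounted = apply_discount(total, 10)
    # Then: the total is $45.00 (postcondition)
    assert discounted == 45.00

test_discount()
print("ok")
```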
I believe it was Feynman who introduced the analogy:
desktop : filing cabinet :: RAM : hard drive
Here's a video: "Richard Feynman Computer Heuristics Lecture" (1985) https://youtu.be/EKWGGDXe5MA
Somewhere in my comments here, I talk about topologically sorting CS concepts; in what little time I spent, I think I suggested "Constructor Theory" (Deutsch 201?) as a first physical principle. https://en.wikipedia.org/wiki/Constructor_theory
> Constructor Theory
https://en.wikipedia.org/wiki/Constructor_theory#Outline
Task, Constructor, Computation Set, Computation Medium, Information Medium, Superinformation Medium (quantum states)
The filing cabinet and disk storage are information mediums / media.
How is the desktop / filing cabinet metaphor mismatched or limiting?
There may be multiple desktops (RAM/Cache/CPU; Computation mediums): is the problem parallelizable?
Consider a resource scheduling problem: there are multiple rooms, multiple projectors, and multiple speakers. Rooms and projectors cost so much. Presenters could use all of an allotted period of time; or they could take more or less time. Some presentations are logically sequenceable (SHOULD/MUST be topologically sorted). Some presentations have a limited amount of time for questions afterward.
Solution: put talks online with an infinite or limited amount of time for asynchronous questions/comments
Solution: in between attending a presentation, also research and share information online (concurrent / asynchronous)
And, like a hash map, make the lookup time for a given typed resource ~O(1), with URLs (URIs) that don't change. (Big-O notation for computational complexity)
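A minimal illustration of the hash-map idea, with made-up stable URIs as dict keys in Python:

```python
# Average-case O(1) lookup of a resource by a stable URI key;
# the URIs and talk metadata here are illustrative.
talks = {
    "https://example.org/talks/intro-to-graphs": {"room": "A", "slot": 1},
    "https://example.org/talks/topological-sort": {"room": "A", "slot": 2},
}

uri = "https://example.org/talks/topological-sort"
print(talks[uri])  # constant time on average, regardless of dict size
```

As long as the URI never changes, every consumer can keep resolving the same key without re-scanning anything.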
Resource scheduling (SLURM,): https://news.ycombinator.com/item?id=15267146
American Red Cross Asks for Ham Radio Operators for Puerto Rico Relief Effort
Zello trended up during hurricane Harvey:
> Push the button for instant, radio-style talk on any Wi-Fi or data plan.
> Access public and private channels.
> Choose button for push-to-talk.
> [...] available for Android, BlackBerry, iPhone, Windows PC and Windows Phone 8
...
> Connects to existing LMR radio systems
> All Radio Technologies
> Interconnect conventional and trunked analog FM, ETSI DMR, ETSI TETRA, MotoTRBO, APCO P25 FDMA, and NXDN.
They probably need some batteries, turbines, and solar cell chargers to get WiFi online?
> This phone needs no battery
http://www.techradar.com/news/this-phone-needs-no-battery
> [...] “We’ve built what we believe is the first functioning cellphone that consumes almost zero power,” said Shyam Gollakota, an associate professor in the Paul G. Allen School of Computer Science & Engineering at the UW and co-author on a paper describing the technology.
> Instead, the phone pulls power from its environment - either from ambient radio signals harvested by an antenna, or ambient light collected by a solar cell the size of a grain of rice. The device consumes just 3.5 microwatts of power during use.
> [...] “And if every house has a Wi-Fi router in it, you could get battery-free cellphone coverage everywhere."
(Also trending on HackerNews right now: https://news.ycombinator.com/item?id=15350799 )
That's not at all useful, nor is it relevant.
Probably also worth mentioning Shelterpods and Responsepods for disaster relief deployments to this crowd; they're designed to take a lot of wind and rain:
https://store.advancedsheltersystemsinc.com/?___store=shelte...
https://store.advancedsheltersystemsinc.com/responsepod/vip/...
Technical and non-technical tips for rocking your coding interview
This is also a great resource, if you're into studying yourself:
"Coding Interview University" https://github.com/jwasham/coding-interview-university
Django 2.0 alpha
Can someone who uses Django review this release? What are the most anticipated changes, etc.?
There's a new `path()` as an alternative to `url()` that allows one to use easier-to-remember syntax for URL params, like `<int:id>` instead of the old `(?P<year>[0-9]{4})`. But if you need it, there's still `url()` for you to use the power of regex in your URLs.
And if your shop still runs on Py2k, you'll need to move to Py3k to use Django 2.0. This also means that websockets support will not land on a Django version that works with Py2k.
How is `<int:id>` identical to `(?P<year>[0-9]{4})`? I'm talking about the {4}. Just an error in their blog post?
Does it support negative long integers?
EDIT: I am without actual internet or mobile tethering and am unable to `git clone https://github.com/django/django -b stable/2.0.x` and check out this convenient new feature.
It's still regex under the hood; the 'int' is just syntactic sugar for `[0-9]+`:
https://github.com/django/django/blob/dda37675db0295baefef37...
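The practical difference between the `int` converter's pattern and the hand-written `{4}` regex (including the negative-integer question above) can be checked with the stdlib `re` module; the pattern strings are taken from this thread, not from Django's source:

```python
import re

int_converter = r"[0-9]+"   # what the 'int' converter boils down to
year_pattern = r"[0-9]{4}"  # the hand-written example from the post

assert re.fullmatch(int_converter, "2017")        # matches
assert re.fullmatch(int_converter, "20170")       # any number of digits
assert re.fullmatch(year_pattern, "2017")         # matches
assert not re.fullmatch(year_pattern, "20170")    # exactly four digits only
assert not re.fullmatch(int_converter, "-12")     # no sign: no negatives
print("int converter is broader than {4}; neither pattern matches '-12'")
```

So they are not identical: `<int:...>` accepts any run of digits (and rejects negative integers), while `{4}` constrains the length.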
Ask HN: What is the best way to spend my time as a 17-year-old who can code?
I'm 17 and I can code at a relatively high level. I'm not really sure what I should be doing. I would like to make some money, but is it more useful to me to contribute to open-source software to add to my portfolio or to find people who will hire me? Even most internships require you to be enrolled as a CS major at a college. I've also tried things like Upwork, but generally people aren't willing to hire a 17-year-old and the pay is very bad. Thanks for any advice!
My GitHub is: https://github.com/meyer9
Pick a #GlobalGoal or three that you find interesting and want to help solve.
Apply Computational Thinking to solving a given problem. Break it down into completable tasks.
You can work on multiple canvasses at once: sometimes it's helpful to let things simmer on the back burner while you're taking care of business. Just don't spread yourself too thin: everyone deserves your time.
Remember ERG theory (and Maslow's Hierarchy). Health and food and shelter are obviously important.
Keep lists of good ideas. Notecards, git, a nice fresh blank sheet of paper for the #someday folder. What to call it isn't important yet. "Thing1" and "Thing2".
You can spend time developing a portfolio, building up your skills, and continuing education. You can also solve a problem now.
You don't need a co-founder at first. You do need to plan to be part of a team: other people are good at other things; and that's the part they most enjoy doing.
Democrats fight FCC's plans to redefine “broadband” from 25+ to 10+ Mbps
The FCC redefined broadband as 25Mbps down and 3Mbps up as reported in the 2015 Broadband Progress Report (from 4Mbps/1Mbps in 2010).
https://www.fcc.gov/reports-research/reports/broadband-progr...
Ask HN: Any detailed explanation of computer science
Any detailed easily understandable explanation of computer science from bottom-up like Feynman's lectures explanation of physics.
Bits
Boolean algebra
Boolean logic gates / (set theory)
CPU / cache
Memory / storage
Data types (signed integers, floats, decimals, strings), encoding
...
A bottom-up (topologically sorted) computer science curriculum (a depth-first traversal of a Thing graph) ontology would be a great teaching resource.
One could start with e.g. "Outline of Computer Science", add concept dependency edges, and then topologically (and alphabetically or chronologically) sort.
https://en.wikipedia.org/wiki/Outline_of_computer_science
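The topological sort step can be sketched with the stdlib. The concepts and dependency edges below are invented for illustration, not taken from the Wikipedia outline:

```python
# Minimal sketch: order CS concepts so prerequisites come first.
# Uses graphlib (Python 3.9+); the edges are illustrative only.
from graphlib import TopologicalSorter

# concept -> set of prerequisite concepts
prereqs = {
    "Bits": set(),
    "Boolean algebra": {"Bits"},
    "Logic gates": {"Boolean algebra"},
    "CPU": {"Logic gates"},
    "Data types": {"Bits"},
}

order = list(TopologicalSorter(prereqs).static_order())
# Every concept now appears after all of its prerequisites.
```

A real curriculum graph would need a tie-breaking sort (alphabetical or chronological, as suggested above) since a topological order is rarely unique.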
There are many potential starting points and traversals toward specialization for such a curriculum graph of schema:Things/skos:Concepts with URIs.
How to handle classical computation as a "collapsed" subset of quantum computation? Maybe Constructor Theory?
https://en.wikipedia.org/wiki/Constructor_theory
From "Resources to get better at theoretical CS?" https://news.ycombinator.com/item?id=15281776 :
- "Open Source Society University: Path to a self-taught education in Computer Science!" https://github.com/ossu/computer-science
This is also great:
- "Coding Interview University" https://github.com/jwasham/coding-interview-university
Neither these nor the ACM Curriculum are specifically topologically sorted.
Ask HN: What algorithms should I research to code a conference scheduling app
I'm interested in writing a utility to assist with scheduling un-conferences. Lets take the following situation for an example:
* 4 conference rooms across 4 time slots, for a total of 16 talks.
* 30 proposed talks
* 60 total participants
Each user would be given 4 (?) votes, un-ranked. (Collection of the votes is a separate topic.) Voting is not secret, and we don't need mathematically precise results. The goal is just to minimize conflicts.
The algorithm would have the following data to work with:
* List of talks with the following properties:
* presenter participant ID
* the participant ID for each user that voted for the talk
I'd like to come up with an algorithm that does the following:
* fills all time slots with the highest-voted topics
* attempts to avoid overlapping votes for any given user in a given time slot
* attempts not to schedule a presenter's talk during a talk they are interested in
* Sugar on top: implement ranked preferences
My question: where do I start to research the algorithms that will be helpful? I know this is a huge project, but I have a year to work on it. I'm also not overly concerned with performance, but would like to keep it from being exponential.
Thank you for any references you can provide!
Resource scheduling, CSP (Constraint Satisfaction programming)
CSP: https://en.wikipedia.org/wiki/Constraint_satisfaction_proble...
Scheduling (production processes):
https://en.wikipedia.org/wiki/Scheduling_(production_process...
Scheduling (computing):
https://en.wikipedia.org/wiki/Scheduling_(computing)
... To an OS, a process thread has a priority and sometimes a CPU affinity.
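Before reaching for a full CSP solver, a greedy first cut is often good enough for the stated "minimize conflicts, no mathematical precision needed" goal. This is a sketch with invented data shapes (talk IDs mapped to voter-ID sets; presenter handling omitted), not a complete solution:

```python
# Greedy un-conference scheduler sketch: place the most-voted talks
# first, each into the time slot whose current talks share the fewest
# voters with it. A real CSP solver would search/backtrack instead.
def schedule(talks, n_rooms=4, n_slots=4):
    # talks: dict talk_id -> set of voter ids
    slots = [[] for _ in range(n_slots)]
    ranked = sorted(talks, key=lambda t: len(talks[t]), reverse=True)
    for talk in ranked[: n_rooms * n_slots]:
        def conflicts(slot):
            # voters who'd be torn between `talk` and talks already here
            return sum(len(talks[talk] & talks[other]) for other in slot)
        open_slots = [s for s in slots if len(s) < n_rooms]
        best = min(open_slots, key=conflicts)
        best.append(talk)
    return slots

demo = {"A": {1, 2, 3}, "B": {1, 2}, "C": {4, 5}, "D": {3}}
result = schedule(demo, n_rooms=2, n_slots=2)
```

Note how talks A and B, which share voters 1 and 2, land in different slots. For the stretch goals (ranked preferences, presenter constraints) a constraint solver or local-search pass over this initial assignment would be the next step.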
What have been the greatest intellectual achievements?
- The internet (TCP/IP) and world wide web (HTML, HTTP).
History of the Internet:
https://en.wikipedia.org/wiki/History_of_the_Internet
History of the World Wide Web:
https://en.wikipedia.org/wiki/History_of_the_World_Wide_Web
- Relational algebra, databases, Linked Data (RDF,).
Relational algebra:
https://en.wikipedia.org/wiki/Relational_algebra
Relational database:
https://en.wikipedia.org/wiki/Relational_database
Linked Data:
https://en.wikipedia.org/wiki/Linked_data
RDF:
https://en.wikipedia.org/wiki/Resource_Description_Framework
- CRISPR/Cas9, CRISPR/Cpf1
https://en.wikipedia.org/wiki/CRISPR
Cas9:
https://en.wikipedia.org/wiki/Cas9
CRISPR/Cpf1:
https://en.wikipedia.org/wiki/CRISPR/Cpf1
- Tissue Nanotransfection
- Time, Calendars
Time > History of the calendar: https://en.wikipedia.org/wiki/Time#History_of_the_calendar
- Standard units of measure (QUDT URIs)
How about Tesla?!
Nikola Tesla > AC (alternating current) and the induction motor:
https://en.wikipedia.org/wiki/Nikola_Tesla#AC_and_the_induct...
Induction motor:
Ask HN: What can't you do in Excel? (2017)
Was just Googling around for whether Excel (sans VBA scripting, of course) is Turing-complete, in order to decide whether to tell a layperson that Excel (or spreadsheets in general) can be considered very much like programming. Came across this 2009 HN thread, "Ask HN: What can't you do in Excel?" from pg:
> One of the startups in the current YC cycle is making a new, more powerful spreadsheet. If there are any Excel power users here, could you please describe anything you'd like to be able to do that you can't currently? Your reward could be to have some very smart programmers working to solve your problem.
https://news.ycombinator.com/item?id=429477
What significant advances -- in Excel/spreadsheets, not the Turing-complete thing -- have been made in the 8 years since? What's the YC startup from that cycle that "is making a new, more powerful spreadsheet", and what is it doing today? I remember Grid [0], but that was from 2012. Any other companies make innovations that would overturn the spreadsheet paradigm, or at least be copied by Excel/OO/GSheets?
A commenter mentioned "Queries", since many spreadsheet users use spreadsheets like a database. I just recently noticed that GSheets has a QUERY function [1] that uses "principles of Structured Query Language (SQL) to do searches". The function has been around since 2015 (according to the Internet Archive [2]), so perhaps I ignored it because its description then was simply, "Runs a Google Visualization API Query Language query across data."
It appears that "Visualization API Query Language" has a lot of SQL-type features with the immediately obvious exception of joins [3].
edit: Multiple people said they would like Excel to have online functionality, i.e. like Google Sheets, but being able to accept VBA and any other features of legacy Excel spreadsheets. There's now Excel Online but I haven't used it (still sticking to Office 2011 for Mac if I ever need to use Excel instead of GS). How seamless is the transition from offline, legacy Excel files to online Excel?
[0] http://blog.ycombinator.com/grid-yc-s12-reinvents-the-spreadsheet-for-the/
[1] https://support.google.com/docs/answer/3093343?hl=en
[2] http://web.archive.org/web/20150319144449/https://support.google.com/docs/answer/3093343?hl=en
[3] https://developers.google.com/chart/interactive/docs/querylanguage
Topological sort on cells based on formula references.
Excel sheets are often highly convoluted in cell cross-references in formulas. It would help to have a clean-up mechanism that performs a topological sort on all the cells with formulas and puts them in a more natural order. It would help to be able to identify the backward references even if the cells are not automatically rearranged.
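A rough sketch of the idea, assuming a toy "=A1+B1"-style formula format (real Excel formulas need a proper parser, and graphlib requires Python 3.9+):

```python
# Sketch: topologically sort spreadsheet cells by formula references,
# so every cell appears after the cells it depends on.
import re
from graphlib import TopologicalSorter

formulas = {
    "C1": "=A1+B1",
    "D1": "=C1*2",
    "A1": "1",
    "B1": "2",
}

# cell -> set of other cells its formula references
refs = {cell: set(re.findall(r"[A-Z]+[0-9]+", f)) & formulas.keys()
        for cell, f in formulas.items()}

order = list(TopologicalSorter(refs).static_order())
# A cycle here (circular reference) raises graphlib.CycleError,
# which is exactly the "backward reference" case worth flagging.
```

The same dependency graph also answers the weaker request: even without rearranging cells, listing the edges identifies the backward references.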
+1 for Topological sort of formulas (e.g. into a Jupyter notebook; as e.g. Python)
https://www.reddit.com/r/statistics/comments/2arevn/how_to_u...
https://www.reddit.com/r/personalfinance/comments/1739oc/imp...
[deleted]
The real data structure for financial reporting and analysis is hyperdimensional, like it or not:
https://en.wikipedia.org/wiki/XBRL
After a 15 year struggle, digital financial reports are a success in the U.S and the Europeans are following up.
---
Also, many people use a spreadsheet when they really want a database. In the office world that would be Access instead of Excel; I like the idea of Access but the implementation is an uncomfortable place of having a complex GUI and having to know some SQL.
---
Finally, decimal arithmetic. Financial calculations should not have the artifacts that come from trying to represent (1/100)th in base 2.
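The base-2 artifact is easy to demonstrate with Python's decimal module; summing one cent a hundred times drifts in binary floating point but stays exact in decimal:

```python
# 0.01 has no exact binary representation, so repeated float addition
# accumulates rounding error; decimal arithmetic keeps cents exact.
from decimal import Decimal

float_total = sum(0.01 for _ in range(100))      # drifts away from 1.0
decimal_total = sum(Decimal("0.01") for _ in range(100))  # exactly 1.00
```

This is why financial spreadsheets built on IEEE-754 doubles need constant rounding discipline that a decimal type would make unnecessary.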
W3C RDF Data Cubes (qb:)
https://wrdrd.github.io/docs/consulting/knowledge-engineerin...
> RDF Data Cubes vocabulary is an RDF standard vocabulary for expressing linked multi-dimensional statistical data and aggregations.
> Data Cubes have dimensions, attributes, and measures
> Pivot tables and crosstabulations can be expressed with RDF Data Cubes vocabulary
And then SDMX is widely used internationally:
https://github.com/pandas-dev/pandas/issues/3402#issuecommen...
Linked Data.
> [...] 7 metadata header rows (column label, property URI path, DataType, unit, accuracy, precision, significant figures)
https://wrdrd.github.io/docs/consulting/linkedreproducibilit...
Specifically, CSVW JSONLD as a lossless output format.
CSVW supports physical units.
https://twitter.com/westurner/status/901990866704900096
> "Model for Tabular Data and Metadata on the Web" (#JSONLD, #RDFa HTML) is for Data on the Web #dwbp #linkeddata https://www.w3.org/TR/tabular-data-model/
> #CSVW defaults to xsd:string if unspecified. "How do you support units of measure?" #qudt https://www.w3.org/TR/tabular-data-primer/#units-of-measure
Elon Musk Describes What Great Communication Looks Like
"The world is flat!"
https://en.wikipedia.org/wiki/The_World_Is_Flat
Check out Thomas L. Friedman (@tomfriedman): https://twitter.com/tomfriedman
Great Ideas in Theoretical Computer Science
List of important publications in computer science https://en.wikipedia.org/wiki/List_of_important_publications...
https://github.com/papers-we-love/papers-we-love#other-good-...
Ask HN: How do you, as a developer, set measurable and actionable goals?
I see a lot of people from other industries, say designers or salespeople, who can set for themselves actionable and measurable goals such as "Make one illustration a day", "Make a logo a day", "Sell X units of Y product a day", "Make X amount of dollars selling product Z by date X", etc.
How do you, as a developer, set measurable goals for yourself, be it at work or in your side hobby?
Some companies use some kind of agile methodology to manage the work.
https://en.wikipedia.org/wiki/Agile_software_development
Two of the most widely used are Scrum and Kanban,
which could be used to track how you're doing against yourself; although these are recommended not to be used for evaluating employees, so take this with a grain of salt.
Burn down chart (each story has complexity points; making it possible to estimate velocity and sprint deadlines):
https://en.wikipedia.org/wiki/Burn_down_chart
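The velocity arithmetic behind a burn-down chart is simple; a toy sketch with invented sprint numbers:

```python
# Toy burn-down arithmetic (numbers invented): given completed story
# points per past sprint, estimate velocity and sprints remaining.
import math

completed = [8, 13, 10]     # points finished in each past sprint
backlog_points = 55         # points left in the backlog

velocity = sum(completed) / len(completed)       # avg points per sprint
sprints_left = math.ceil(backlog_points / velocity)
```

This is the measurable-goal analogue for developers: velocity per sprint rather than units sold per day, with the usual caveat that points measure the team, not individuals.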
User stories in a "story map" (Kanban board) with labels and/or milestones for epics, flights, themes:
https://en.wikipedia.org/wiki/User_story#Story_map
Software Development > Requirements Management > Agile Modeling > User Story: https://wrdrd.github.io/docs/consulting/software-development...
Bitcoin Energy Consumption Index
... Speaking of environmental externalities,
In the US, "Class C" fire extinguishers work on electrical fires:
From Fire_class#Electrical:
https://en.wikipedia.org/wiki/Fire_class#Electrical
> Carbon dioxide CO2, NOVEC 1230, FM-200 and dry chemical powder extinguishers such as PKP and even baking soda are especially suited to extinguishing this sort of fire. PKP should be a last resort solution to extinguishing the fire due to its corrosive tendencies. Once electricity is shut off to the equipment involved, it will generally become an ordinary combustible fire.
> In Europe, "electrical fires" are no longer recognized as a separate class of fire as electricity itself cannot burn. The items around the electrical sources may burn. By turning the electrical source off, the fire can be fought by one of the other class of fire extinguishers [citation needed].
How does this compare to carbon-intensive resource extraction operations like gold mining?
(Gold is industrially and medically useful, IIUC)
See also:
"So, clean energy incentives" https://news.ycombinator.com/item?id=15070430
Dancing can reverse the signs of aging in the brain
"Dancing or Fitness Sport? The Effects of Two Training Programs on Hippocampal Plasticity and Balance Abilities in Healthy Seniors"
Front. Hum. Neurosci., 15 June 2017 | https://doi.org/10.3389/fnhum.2017.00305
Adult neurogenesis:
https://en.wikipedia.org/wiki/Adult_neurogenesis
IIUC:
{Omega 3/6, Cardiovascular exercise,} -> Endocannabinoids -> [Hippocampal,] neurogenesis
"Neurobiological effects of physical exercise" (Hippocampal plasticity, neurogenesis,)
https://en.wikipedia.org/wiki/Neurobiological_effects_of_phy...
"Study: Omega-3 fatty acids fight inflammation via cannabinoids" https://news.illinois.edu/blog/view/6367/532158 (Omega 6: Omega 3 ratio)
scholar.google q=cannabinoid+neurogenesis https://scholar.google.com/scholar?q=cannabinoid+neurogenesi...
Functions of the ECS (Endocannabinoid System):
https://en.wikipedia.org/wiki/Endocannabinoid_system#Functio...
- #Role-in-hippocampal-neurogenesis, "runners high"
Rumours swell over new kind of gravitational-wave sighting
This might be a naive question, but I'd like to clear it out.
Do gravitational waves travel at the speed of light?
I know the theory says nothing can travel faster than light. I also know that photons can be seen as quanta or as waves. So my guess is that gravitational waves travel at most, at the speed of light.
But do they? or do they travel slower? faster? Is there a doppler effect for GWs?
I ask because I would think ripples in the space-time fabric itself might be a bit different than light waves or other more studied phenomena.
Can anyone point me in the right direction?
These PBS Spacetime episodes should help:
The Speed of Light is NOT About Light - https://www.youtube.com/watch?v=msVuCEs8Ydo
Is Quantum Tunneling Faster than Light? - https://www.youtube.com/watch?v=-IfmgyXs7z8
The Quantum Experiment that Broke Reality - https://www.youtube.com/watch?v=RlXdsyctD50
Pilot Wave Theory and Quantum Realism - https://www.youtube.com/watch?v=RlXdsyctD50
The Future of Gravitational Waves - https://www.youtube.com/watch?v=eJ2RNBAFLj0
Thanks!
"Neutron star": https://en.wikipedia.org/wiki/Neutron_star
New Discovery Simplifies Quantum Physics
OpenAI has developed new baseline tool for improving deep reinforcement learning
https://blog.openai.com/openai-baselines-dqn/ (May 2017)
Deep Learning RL (Reinforcement Learning) algos in this batch of OpenAI RL baselines: DQN, Double Q Learning, Prioritized Replay, Dueling DQN
Src: https://github.com/openai/baselines
[deleted]
https://blog.openai.com/baselines-acktr-a2c/ (August 2017)
ACKTR & A2C (~=A3C)
(The GitHub readme lists: A2C, ACKTR, DDPG, DQN, PPO, TRPO)
... openai/baselines/commits/master: https://github.com/openai/baselines/commits/master
The prior can generally only be understood in the context of the likelihood
Bayes assumes/requires conditional independence of observations, which is sometimes the case.
For example:
- Are the positions of the Earth and the Moon conditionally independent? No.
- In the phrase "the dog and the cat", are "and" and "the" independent? No.
- In a biological system, are we to assume conditional independence? We should not.
https://en.wikipedia.org/wiki/Conditional_independence
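The definition can be checked numerically. Here's a small sketch with an invented joint distribution, deliberately constructed so that A and B *are* conditionally independent given C (the real-world examples above are cases where this construction fails):

```python
# Conditional independence: P(a, b | c) == P(a | c) * P(b | c) for all a, b, c.
# The joint table is built as P(c) * P(a|c) * P(b|c), so the identity
# holds by construction; with real data it usually would not.
from itertools import product

p_c = {0: 0.5, 1: 0.5}
p_a_given_c = {0: {0: 0.2, 1: 0.8}, 1: {0: 0.6, 1: 0.4}}
p_b_given_c = {0: {0: 0.7, 1: 0.3}, 1: {0: 0.1, 1: 0.9}}

joint = {(a, b, c): p_c[c] * p_a_given_c[c][a] * p_b_given_c[c][b]
         for a, b, c in product((0, 1), repeat=3)}

def cond(event, c):
    """P(event | C=c), where event is a predicate over (a, b)."""
    pc = sum(v for (a, b, cc), v in joint.items() if cc == c)
    hit = sum(v for (a, b, cc), v in joint.items() if cc == c and event(a, b))
    return hit / pc

for a, b, c in product((0, 1), repeat=3):
    lhs = cond(lambda x, y: x == a and y == b, c)
    rhs = cond(lambda x, y: x == a, c) * cond(lambda x, y: y == b, c)
    assert abs(lhs - rhs) < 1e-12
```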
...
"Efficient test for nonlinear dependence of two continuous variables" https://www.ncbi.nlm.nih.gov/pmc/articles/PMC4539721/
- In no particular sequence: CANOVA, ANOVA, Pearson, Spearman, Kendall, MIC, Hoeffding
From https://plato.stanford.edu/entries/logic-inductive/ :
> It is now generally held that the core idea of Bayesian logicism is fatally flawed—that syntactic logical structure cannot be the sole determiner of the degree to which premises inductively support conclusions. A crucial facet of the problem faced by Bayesian logicism involves how the logic is supposed to apply to scientific contexts where the conclusion sentence is some hypothesis or theory, and the premises are evidence claims. The difficulty is that in any probabilistic logic that satisfies the usual axioms for probabilities, the inductive support for a hypothesis must depend in part on its prior probability. This prior probability represents how plausible the hypothesis is supposed to be based on considerations other than the observational and experimental evidence (e.g., perhaps due to relevant plausibility arguments). A Bayesian logicist must tell us how to assign values to these pre-evidential prior probabilities of hypotheses, for each of the hypotheses or theories under consideration. Furthermore, this kind of Bayesian logicist must determine these prior probability values in a way that relies only on the syntactic logical structure of these hypotheses, perhaps based on some measure of their syntactic simplicities. There are severe technical problems with getting this idea to work. Moreover, various kinds of examples seem to show that such an approach must assign intuitively quite unreasonable prior probabilities to hypotheses in specific cases (see the footnote cited near the end of section 3.2 for details). Furthermore, for this idea to apply to the evidential support of real scientific theories, scientists would have to formalize theories in a way that makes their relevant syntactic structures apparent, and then evaluate theories solely on that syntactic basis (together with their syntactic relationships to evidence statements). Are we to evaluate alternative theories of gravitation (and alternative quantum theories) this way?
>"This prior probability represents how plausible the hypothesis is supposed to be based on considerations other than the observational and experimental evidence (e.g., perhaps due to relevant plausibility arguments)."
I guess I don't know how "Bayesian logicism" differs from "Bayesian probability", but this is totally false in the latter case. The prior is just supposed to be independent of the current data (e.g., devised before it was collected). In practice, info almost always leaks into the prior + model via tinkering. That is why a priori predictions are so important to proving you are onto something.
Bayesian logicism is the logic derived from Bayesian probability.
Magic numbers are an anti-pattern: which constants are used, and why, should be justified; or it should be shown that a non-expert-biased form converges regardless.
The use of the term prior probability in that paragraph is not consistent with its use in bayesian probability, so something is wrong.
Also, I am not sure what magic numbers you are referring to.
Ask HN: How to find/compare trading algorithms with Quantopian?
I found this, which links to a number of quantitative trading algorithms that significantly outperform as compared with SPY (an S&P 500 ETF):
"Community Algorithms Migrated to Quantopian 2"
https://www.quantopian.com/posts/community-algorithms-migrat...
Why even build a business, create jobs, and solve the world's problems?
... "Impact investing"
https://en.wikipedia.org/wiki/Impact_investing
"Is this a good way to invest in solving for the #GlobalGoals for Sustainable Development ( https://GlobalGoals.org )?"
Ask HN: How do IPOs and ICOs help a business raise capital?
IPO: "Initial Public Offering"
https://en.wikipedia.org/wiki/Initial_public_offering
ICO: "Initial Coin Offering"
https://en.wikipedia.org/wiki/Initial_coin_offering
Solar Window coatings “outperform rooftop solar by 50-fold”
MS: Bitcoin mining uses as much electricity as 1M US homes
So, clean energy incentives.
> That means 1.2% of the Sahara desert is sufficient to cover all of the energy needs of the world in solar energy.
https://www.forbes.com/sites/quora/2016/09/22/we-could-power...
Nearly all other animals on the planet survive entirely on solar energy.
Ask HN: What are your favorite entrepreneurship resources
Hey everyone, I'm teaching an undergraduate class in the fall at a local university here in Miami (FIU) and would love your recommendations on what books or articles or frameworks you think the students should read. My goal for the class is to teach them how to identify problems and prototype solutions for those problems. Hopefully, they make some money from them to help pay for books, etc.
I put these notes together:
Entrepreneurship: https://wrdrd.github.io/docs/consulting/entrepreneurship
- #plan-for-failure
- #plan-for-success
Investing > Capitalization Table: https://wrdrd.github.io/docs/consulting/investing#capitaliza...
- I'll add something about Initial Coin Offerings (which are now legal in at least Delaware).
AngelList ( https://angel.co for VC jobs and funding ) asks "What's the most useful business-related book you've ever read?" ... Getting Things Done (David Allen), 43Folders = 12 months + 31 days (Merlin Mann), The Art of the Start (Guy Kawasaki), The Personal MBA (Josh Kaufman)
Lever ( https://www.lever.co ) makes recruiting and hiring (some parts of HR) really easy.
LinkedIn ( https://www.linkedin.com ) also has a large selection of qualified talent: https://smallbusiness.linkedin.com/hiring
... How much can you tell about a candidate from what they decide to write about themselves on the internet?
USA Small Business Administration: "10 steps to start your business." https://www.sba.gov/starting-business/how-start-business/10-...
"Startup Incorporation Checklist: How to bootstrap a Delaware C-corp (or S-corp) with employee(s) in California" https://github.com/leonar15/startup-checklist
Jupyter Notebook (was: IPython Notebook) notebooks are diff'able and executable. Spreadsheets can be hard to review. https://github.com/jupyter/notebook
It's now installable with one conda command: ``conda install -y notebook pandas qgrid``
CPU Utilization is Wrong
Instructions per cycle: https://en.wikipedia.org/wiki/Instructions_per_cycle
What does IPC tell me about where my code could/should be async so that it's not stalled waiting for IO? Is combined IO rate a useful metric for this?
There's an interesting "Cost per GFLOPs" table here: https://en.wikipedia.org/wiki/FLOPS
Btw these are great, thanks: http://www.brendangregg.com/linuxperf.html
( I still couldn't fill this out if I tried: http://www.brendangregg.com/blog/2014-08-23/linux-perf-tools... )
It's not that your code should be async, but that it should be more cache-friendly. It's probably mostly stalled waiting for RAM.
Ask HN: Can I use convolutional neural networks to classify videos on a CPU
Is there any way that I can use conv nets to classify videos on a CPU? I do not have GPUs, but I want to classify videos.
There's a table with runtime comparisons for a convnet here: https://github.com/ryanjay0/miles-deep/ (GPU CuDNN: 15s, GPU: 19s, CPU: 159s)
(Also written w/ Caffe: https://github.com/yahoo/open_nsfw)
Esoteric programming paradigms
Re: "Dependent Types"
In Python, PyContracts supports runtime type-checking and value constraints/assertions (as @contract decorators, annotations, and docstrings).
https://andreacensi.github.io/contracts/
Unfortunately, there is as yet no unifying syntax between PyContracts and the newer Python type annotations, which mypy checks at compile time.
https://github.com/python/typeshed
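To show what runtime checking of types *and* value constraints looks like, here is a minimal stdlib sketch in the spirit of @contract decorators. The `checked` decorator and its constraint format are invented for illustration; this is not the PyContracts API:

```python
# Illustrative runtime type + value checking, PyContracts-style.
# `checked` and its (type, predicate) constraint format are made up
# for this sketch; PyContracts uses its own contract string language.
import functools

def checked(**constraints):
    def decorator(func):
        @functools.wraps(func)
        def wrapper(**kwargs):
            for name, (typ, pred) in constraints.items():
                value = kwargs[name]
                if not isinstance(value, typ) or not pred(value):
                    raise TypeError(f"{name}={value!r} violates its contract")
            return func(**kwargs)
        return wrapper
    return decorator

@checked(n=(int, lambda v: v >= 0))
def fact(n):
    return 1 if n < 2 else n * fact(n=n - 1)
```

Static annotations (`def fact(n: int) -> int`) catch the type part before the program runs, but the value constraint (`v >= 0`) is the part mypy can't express, which is roughly where the two systems fail to unify.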
What does it mean for types to be "a first class member of" a programming language?
Reasons blog posts can be of higher scientific quality than journal articles
So, schema.org has classes (C:) -- subclasses of CreativeWork and Article -- for property (P:) domains (D:) and ranges (R:) which cover this domain:
- CreativeWork: http://schema.org/CreativeWork
- - BlogPosting: http://schema.org/BlogPosting
- - Article: http://schema.org/Article
- - - NewsArticle: http://schema.org/NewsArticle
- - - Report: http://schema.org/Report
- - - ScholarlyArticle: http://schema.org/ScholarlyArticle
- - - SocialMediaPosting: http://schema.org/SocialMediaPosting
- - - TechArticle: http://schema.org/TechArticle
Thing: (name, [url], [identifier], [#about], [description[_gh_markdown_html]])
- C: CreativeWork:
- - P: comment R: Comment
- - C: Comment: https://schema.org/Comment
Fact Check now available in Google Search and News
So, publishers can voluntarily add https://schema.org/ClaimReview markup as RDFa, JSON-LD, or Microdata.
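A sketch of what such JSON-LD markup might look like, built as a Python dict. The claim, URLs, and rating values are invented for illustration; schema.org/ClaimReview documents the actual property list:

```python
# Hypothetical schema.org/ClaimReview markup serialized as JSON-LD.
# All field values here are invented examples.
import json

claim_review = {
    "@context": "https://schema.org",
    "@type": "ClaimReview",
    "claimReviewed": "Example claim being fact-checked",
    "url": "https://example.com/fact-check/123",
    "author": {"@type": "Organization", "name": "Example Fact Checker"},
    "reviewRating": {
        "@type": "Rating",
        "ratingValue": 1,
        "bestRating": 5,
        "alternateName": "False",
    },
}

jsonld = json.dumps(claim_review, indent=2)
# Embedded in a page as: <script type="application/ld+json">...</script>
```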
Ask HN: Is anyone working on CRISPR for happiness?
"studies have found that genetic influences usually account for 35-50% of the variance in happiness measures"
No doubt there are many reasons why this is extremely complicated
Roadmap to becoming a web developer in 2017
Nice.
- https://github.com/fkling/JSNetworkX would be a cool way to build interactive schema:Thing/CreativeWork curriculum graph visualizations (and BFS/DFS traversal)
- #WebSec: https://wrdrd.com/docs/consulting/web-development#websec
- Web Development Checklist: https://wrdrd.com/docs/consulting/web-development#web-develo...
-- http://webdevchecklist.com/
- | Web Frameworks (GitHub Sphinx wiki (./Makefile)): https://westurner.org/wiki/webframeworks (| Wikipedia, | Homepage, Source, Docs,)
Ask HN: How do you keep track/save your learnings?(so that you can revisit them)
- Vim Voom: `:Voom rest`, `:Voom markdown`
- Jupyter notebooks
- Sphinx docs: https://wrdrd.com/docs/consulting/research#research-tools src: https://github.com/wrdrd/docs/blob/master/docs/consulting/re...
- Sphinx wiki (./Makefile):
-- Src: https://github.com/westurner/wiki
-- Src: https://github.com/westurner/wiki/wiki
-- Web: https://westurner.org/wiki/workflow
interesting stuff, thanks for sharing!
Use a spaced repetition flashcard program like Anki or Mnemosyne or one of the others (https://en.wikipedia.org/wiki/List_of_flashcard_software) and use it on a regular basis, preferably every day.
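The scheduling behind these programs is typically a variant of the SM-2 algorithm. A simplified sketch (the constants follow the published SM-2 description; this is not Anki's exact scheduler, which modifies SM-2 in several ways):

```python
# Simplified SM-2-style spaced repetition. quality is the recall grade
# 0-5; returns (next_interval_days, new_easiness, new_repetition_count).
def sm2_step(quality, repetition, interval, easiness=2.5):
    if quality < 3:                      # failed recall: start over
        return 1, easiness, 0
    easiness = max(
        1.3,
        easiness + 0.1 - (5 - quality) * (0.08 + (5 - quality) * 0.02),
    )
    repetition += 1
    if repetition == 1:
        interval = 1                     # first success: review tomorrow
    elif repetition == 2:
        interval = 6                     # second success: six days out
    else:
        interval = round(interval * easiness)  # then grow geometrically
    return interval, easiness, repetition
```

The geometric growth of the interval is what makes daily use cheap: mature cards come back rarely, so the daily queue stays dominated by new and struggling material.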
[deleted]
Ask HN: Criticisms of Bayesian statistics?
In tech circles, it seems that Bayesian statistics is often favored over classical frequentist statistics. In my study of both Bayesian and frequentist statistics, it seems that the results of a Bayesian analysis are generally more intuitive, such as when comparing Bayesian credible intervals to frequentist confidence intervals. It also seems like Bayesian analysis avoids what I think is one of the most serious problems in analysis, the multiple comparisons problem. It's been easy for me to find any number of Bayesian critiques of frequentist stats, but I have rarely seen frequentist defenses against Bayesian stats. This may simply be because I mostly read technology related sites as opposed to more general statistics oriented sites. As such, I would really appreciate hearing some frequentist critiques of Bayesian stats. I feel like the situation can't be as cut and dry as one being better than the other in all things, so I would like to acquire a more balanced perspective by hearing about the other side. Thanks!
~bayesian logicism
https://plato.stanford.edu/entries/logic-inductive/ :
> It is now generally held that the core idea of Bayesian logicism is fatally flawed—that syntactic logical structure cannot be the sole determiner of the degree to which premises inductively support conclusions. [...]
80,000 Hours career plan worksheet
> What are your best medium-term options (3-15 years)?
> 1. What global problems do you think are most pressing?
The 17 UN Sustainable Development Goals (SDGs) and 169 targets w/ statistical indicators, AKA GlobalGoals, are for the whole world through 2030.
https://en.wikipedia.org/wiki/Sustainable_Development_Goals
"Schema.org: Mission, Project, Goal, Objective, Task" https://news.ycombinator.com/item?id=12525141 could make it easy to connect our local, regional, national, and global goals; and find people with similar objectives and solutions.
World's first smartphone with a molecular sensor is coming in 2017
> Looking at the back of the phone, you'd be forgiven for thinking the sensor is just the phone's camera. But that odd-looking dual lens is the scanner, basically the embedded version of the SCiO. It uses spectrometry to shine near-infrared light on objects — fruit, liquids, medicine, even your body — to analyze them.
> Say you're at the supermarket and you want to check how fresh the tomatoes are. Instead of squeezing them, you'd just launch the SCiO app, hold the scanner up to the skin of the tomato, and it will tell you how fresh it is on a visual scale. Do the same thing to your body and you can check your body mass index (BMI). You need to specify the thing you're scanning at the outset, and the actual analysis is performed in the cloud, but the whole process is a matter of seconds, not minutes.
https://en.wikipedia.org/wiki/Spectroscopy
... Tricorder X PRIZE: https://en.wikipedia.org/wiki/Tricorder_X_Prize
Ask HN: How would one build a business that only develops free software?
So I was reading Richard Stallman's blog on why you should not use google/uber/apple/twitter etc and I understand his reasoning. But what I don't understand is how would one go about building a startup or business that develops and distributes free software only and make good money doing so?
For example, would it be possible to build a free software version of uber/twitter/facebook etc? How would that work?
By removing all restrictions on the software, what is the incentive to not pirate the software? The GPL can be enforced, but that is clearly not practical especially outside the US.
A business is more than just the source code. The source for Reddit, for example, is OSS. So anybody should be able to put Reddit out of business in a few days, right? But yet... nobody has. Hmmm....
So yeah, you can distribute source code and still make money. Lots of companies do it. Red Hat, SugarCRM, Alfresco, etc.
In some ways it's probably even easier for an online service to both be OSS and be successful, exactly because it takes all the other "stuff" (hosting, devops, marketing, network effects, etc.) to be successful. And, in fact, there are OSS replacements for things like Facebook and Twitter. The problem is, very few people use them for whatever reason (probably mostly network effects).
So at least in regards to the Facebooks, Twitters, etc. of the world, the first question you'd have to answer, is how to get people to switch to your service, whether it's free software or otherwise.
> The source for Reddit [...]
Src: https://github.com/reddit/reddit/blob/master/r2/setup.py
Docs: https://github.com/reddit/reddit/wiki/Install-guide
"Reddit Enhancement Suite (RES)" is donationware: https://github.com/honestbleeps/Reddit-Enhancement-Suite
"List of Independent GNU social Instances" http://skilledtests.com/wiki/List_of_Independent_GNU_social_...
> [...] the first question you'd have to answer, is how to get people to switch to your service, whether it's free software or otherwise.
"Growth hacking": https://en.wikipedia.org/wiki/Growth_hacking
"Business models for open-source software" https://en.wikipedia.org/wiki/Business_models_for_open-sourc...
...
- https://github.com/GoogleCloudPlatform
- https://github.com/kubernetes/kubernetes (Apache 2.0)
- https://github.com/apple (Swift is Apache 2.0)
- https://github.com/microsoft
- https://github.com/twitter/innovators-patent-agreement
...
- "GNU Social" (GNU AGPL v3) https://en.wikipedia.org/wiki/GNU_social
... http://choosealicense.com/appendix/ has a table for comparison of open source software licenses.
http://tinyurl.com/p6mka3k describes Open Source Governance in a chart with two axes (Cathedral / Bazaar , Benevolent Dictator / Formal Meritocracy) ... as distinct from https://en.wikipedia.org/wiki/Open-source_governance , which is the application of open source software principles to government. USDS Playbook advises "Default to open" https://playbook.cio.gov/#play13
Anarchy / Budgeting: https://github.com/WhiteHouse/budgetdata
Ask HN: If your job involves continually importing CSVs, what industry is it?
I was wondering if people still use CSVs for data exchange now, or if we've mostly moved to JSON and XML.
Arguing for the CSVW (CSV on the Web) W3C Standards:
- "CSV on the Web: A Primer" http://w3c.github.io/csvw/primer/
- Src: https://github.com/w3c/csvw
- Columns have URIs (ideally from a shared RDFS/OWL vocabulary)
- Columns have XSD datatype URIs
- CSVW can be represented as RDF, JSON, JSONLD
With CSV, which extra metadata file describes how many rows at the top are for columnar metadata? (I.e. column labels, property URI, XSD datatype URI, units URI, precision, accuracy, significant figures) ... https://wrdrd.com/docs/consulting/linkedreproducibility#csv-...
... CSVW: https://wrdrd.com/docs/consulting/knowledge-engineering#csvw
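As a rough sketch of what CSVW-style metadata buys you: a small table description lets a consumer interpret raw CSV cells with declared datatypes instead of guessing. The column names, file name, and property URIs below are hypothetical, and this is only an illustration of the idea, not a conforming CSVW processor:

```python
import csv, io, json

# Hypothetical CSVW-style metadata for a two-column table.
# Column names, URIs, and datatypes are illustrative only.
metadata = {
    "@context": "http://www.w3.org/ns/csvw",
    "url": "observations.csv",
    "tableSchema": {
        "columns": [
            {"name": "site", "datatype": "string",
             "propertyUrl": "http://example.org/vocab#site"},
            {"name": "temp_c", "datatype": "decimal",
             "propertyUrl": "http://example.org/vocab#temperature"},
        ]
    },
}

CSV_TEXT = "site,temp_c\nstation-1,21.5\nstation-2,19.0\n"

def rows_as_json(csv_text, meta):
    """Interpret raw CSV cells using the declared column datatypes."""
    cols = meta["tableSchema"]["columns"]
    out = []
    for row in csv.DictReader(io.StringIO(csv_text)):
        typed = {}
        for col in cols:
            raw = row[col["name"]]
            typed[col["name"]] = float(raw) if col["datatype"] == "decimal" else raw
        out.append(typed)
    return out

print(json.dumps(rows_as_json(CSV_TEXT, metadata)))
```

The point is that the datatype lives in one shared metadata file rather than in every downstream script's ad-hoc parsing code.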
@prefix csvw: <http://www.w3.org/ns/csvw#> .
@context: http://www.w3.org/ns/csvw.jsonld
Ask HN: Maybe I kind of suck as a programmer – how do I supercharge my work?
I'm in my late twenties and I'm having a bit of a tough time dealing with my level of programming skill.
Over the past 3 years, I've released a few apps on iOS: not bad, nothing that would amaze anyone here. The code is generally messy and horrible, rife with race conditions and barely holding together in parts. (Biggest: 30k LOC.) While I'm proud of my work — especially design-wise — I feel most of my time was spent on battling stupid bugs. I haven't gained any specialist knowledge — just bloggable API experience. There's nothing I could write a book about.
Meanwhile, when I compulsively dig through one-man frameworks like YapDatabase, Audiobus, or AudioKit, I am left in awe! They're brimming with specialist knowledge. They're incredibly documented and organized. Major features were added over the course of weeks! People have written books about these frameworks, and they were created by my peers — probably alongside other work. Same with one-man apps like Editorial, Ulysses, or GoodNotes.
I am utterly baffled by how knowledgeable and productive these programmers are. If I'm dealing with a new topic, it can take weeks to get the lay of the land, figure out codebase interactions, consider all the edge cases, etc. etc. But the commits for these frameworks show that the devs basically worked through their problems over mere days — to say nothing of getting the overall architecture right from the start. An object cache layer for SQL? Automatic code gen via YAML? MIDI over Wi-Fi? Audio destuttering? Pff, it took me like a month to add copy/paste to my app!
I'm in need of some recalibration. Am I missing something? Is this quality of work the norm, or are these just exceptional programmers? And even if they are, how can I get closer to where they're standing? I don't want to wallow in my mediocrity, but the mountain looks almost insurmountable from here! No matter the financial cost or effort, I want to make amazing things that sustain me financially; but I can't do that if it takes me ten times as long to make a polished product as another dev. How do I get good enough to consistently do work worth writing books about?
For identifying strengths and weaknesses: "Programmer Competency Matrix":
- http://sijinjoseph.com/programmer-competency-matrix/
- https://competency-checklist.appspot.com/
- https://github.com/hltbra/programmer-competency-checklist
... from: https://wrdrd.com/docs/consulting/software-development#compu...
> How do I get good enough to consistently do work worth writing books about?
- These are great reads: "The Architecture of Open Source Applications" http://aosabook.org/en/
- TDD.
[deleted]
Ask HN: Anything Like Carl Sagan's Cosmos for Computer Science?
Is there anything like Carl Sagan's Cosmos that talks about the history of computing in an accessible way? Pondering Christmas gifts for my niece.
Computer #History: https://en.wikipedia.org/wiki/Computer
Outline of Computer Engineering #History of: https://en.wikipedia.org/wiki/Outline_of_computer_engineerin...
History of Computer Science: https://en.wikipedia.org/wiki/History_of_computer_science
Outline of Computer Science: https://en.wikipedia.org/wiki/Outline_of_computer_science
History of the Internet: https://en.wikipedia.org/wiki/History_of_the_Internet
History of the World Wide Web: https://en.wikipedia.org/wiki/History_of_the_World_Wide_Web
... maybe a bit OT; but, interestingly, IDK if any of these include a history section:
#K12CSFramework (Practices, Concepts): https://k12cs.org
- "Impacts of Computing" (Culture; Social Interactions; Safety, Law, and Ethics): https://k12cs.org/framework-statements-by-progression/#jump-...
"Competencies and Tasks on the Path to Human-Level AI" (Perception, Actuation, Memory, Learning, Reasoning, Planning, Attention, Motivation, Emotion, Modeling Self and Other, Social Interaction, Communication, Quantitative, Building/Creation): http://wiki.opencog.org/w/CogPrime_Overview#Competencies_and...
Code.org (#HourOfCode): https://code.org/learn
Not really in keeping with the idea of a gift suitable for a niece.
No, but arguably more comprehensive and informative than any one video. These links (to #OER) would be useful for anyone intending to replicate the form and style of the "Cosmos" video series with Computer Science content.
Cosmos was also a dead tree book. [1] It was not uncommonly given as a gift.
The original TV series was broadcast the same year as its publication, 1980, but I don't think it was readily available on consumer tape until several years later and then not at normal holiday gift prices. Back in those days, most video libraries were built by the librarian directly recording broadcasts. But most people would just wait for a rebroadcast.
Learn X in Y minutes
this has been shared/posted a hundred times before... nothing new here.
Eh, I can't honestly tell whether more X's have been added since the last time it was posted, and I don't feel like digging through the Wayback Machine to find out.
The source is hosted on GitHub; there's a commit log (for each file and directory): https://github.com/adambard/learnxinyminutes-docs/commits/ma...
Org mode 9.0 released
I love Org mostly for its ability to link to stuff. In my mind it's the big feature that sets it apart from using, say, a separate application to do my task management. For instance:
- When I'm reading email (in emacs), I can quickly create a TODO that links back to the current email I'm reading (most of my TODOs, in fact, link to an email, so this is very useful),
- For my org file where I keep notes about the servers I'm running, I might link to a specific line on a remote apache config,
- For a bug report I might link to a specific git commit in a project to look at later
Any TODO that is important gets scheduled, so that it is linked to from the agenda view. In this way it's very hard for me to lose track of anything, despite most of my work communication happening through email. In fact I now prefer email over using something like Basecamp, because org makes it easier for me to manage!
Because Emacs is an OS the way it should be — an extensible system with a unified UI instead of a pile of separate and hardly interoperable applications.
"b-but the UNIX philosophy!"
The more time I spend with UNIX and hear the "UNIX philosophy" used in place of argument when someone doesn't like the feature set of some software, the more I realize that we need to move on from UNIX. We should take more ideas from things like Emacs and the Lisp machines when designing our software, and less from UNIX.
Emacs has very few self-contained mega packages; in its own way, Emacs software is built by chaining together smaller packages. Many packages are little more than extensions to other packages.
But their interop is better. Besides Emacs packages having a much greater "contact surface" (packages can augment other packages, if needed), the biggest problem with the UNIX philosophy is using unstructured text as the exchange medium. It leads to huge waste and a proliferation of bugs, as each tool or script you use has to have its own custom (and bug-ridden) parser, and outputs its own unstructured and usually undocumented data format. This is one of those moments when the history of computing took a huge step backwards. All that recent push for passing JSON around instead of plaintext? That's just rediscovering what we lost when UNIX came along.
Filenames may contain newlines. JSON strings may contain newlines.
The modular aspects of the UNIX philosophy are pretty cool; the data interchange format (un-typed \n-delimited strings) is irrational (and dangerous).
JSON w/ a JSONLD @context and XSD type URIs may also contain newlines (which should be escaped)
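A small sketch (in Python rather than shell) of the point being made here: a literal newline inside a value silently breaks newline-delimited exchange, while JSON escapes it and round-trips cleanly. The filenames are made up:

```python
import json

# Two "filenames", one containing an embedded newline --
# which is legal on most UNIX filesystems.
names = ["notes.txt", "evil\nname.txt"]

# Naive newline-delimited exchange loses the structure:
# splitting back yields three entries, not two.
newline_delimited = "\n".join(names)
assert newline_delimited.split("\n") != names

# JSON escapes the newline inside the string, so it round-trips:
encoded = json.dumps(names)
assert "\\n" in encoded            # escaped, not a literal newline
assert json.loads(encoded) == names
```

The same ambiguity is why `find -print0` / `xargs -0` exist: once the delimiter can also appear in the data, an escaping (or typed) format is the only safe option.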
Note that, with OSX bash, tab \t must be specified as $'\t'.
And, sometimes, it's \r\n instead of just \n (which is extra-format metadata).
And then Unicode. Oh yeah, unicodë.
You don't have to use $'', you can also use literal tabs (ctrl-v to insert literals). The main difference between macOS and Linux that people notice is that macOS sed doesn't itself interpret \t (so you have to use literal tabs or $'' there).
\r\n is Windows, not Unix.
What about Unicode? (Btw, UTF-8 was created by unixers Rob Pike and Ken Thompson https://www.cl.cam.ac.uk/~mgk25/ucs/utf-8-history.txt )
Ask HN: Best Git workflow for small teams
I have been building up a small team of programmers that are coming from CVS. I am looking for some ideas on ideal workflows.
What do you currently use for teams of 5-10 people?
We've been using the Github flow (or some variation of it) in teams of 2-10 people for a few years now. We work on feature branches, merge after code review using the github web UI. Here's a few things that help us:
- Make a rule that anything that's in master can be deployed by anyone at any time. This will enforce discipline for code review, but also for introducing changes that e.g. require migrations. You'll use feature flags more, split db schema changes into several deployments, make code work with both schema versions, etc. All good practices that you'll need anyway when you reach larger scale and run into issues with different versions of code running concurrently during deployments.
- If you're using Github, check out the recently added options for controlling merging. Enable branch protection for master, so that only pull requests that are green on CI and have been reviewed can be merged. Enable the different dropdown options for the merge button (e.g. "rebase and merge"); these are useful for small changes.
- It takes some time to get used to those constraints. Make sure that everyone is on board, and review the workflow regularly, just as you would review your code. I think it's worth it in the long run.
+1 for HubFlow (GitFlow > HubFlow).
- https://westurner.org/tools/#hubflow
- Src: https://github.com/datasift/gitflow
- Docs: https://datasift.github.io/gitflow/
-- The git branch diagrams (originally from GitFlow) are extremely helpful: https://datasift.github.io/gitflow/IntroducingGitFlow.html
TDD Doesn't Work
tl;dr: a study confirmed that working in small chunks and writing tests as you go is good, but found that it matters little whether you write the tests before or after each small chunk of code.
Aren't tests supposed to be a tool to help design an API? From that perspective, a test should be written first. The problem IMHO is the choice of methodology, as there are several kinds of tests; some may be more time-consuming when it comes to setup.
Personally, I think unit tests shine best when you're designing an API. I can swing from hate to love and back about TDD in minutes, but when it comes to thinking about how your code will be used, unit tests (did we stop using that term?) are one of the most useful tools I have.
I guess if all code written could be seen as an API, TDD would be great, but that's not the world I live in.
> I guess if all code written could be seen as an API, TDD would be great, but that's not the world I live in.
If not an "Application Programming Interface", isn't all code an Interface? There's input and there's output.
With object-oriented programming, the fact that there is an interface is more explicit (even if all you're doing is implementing objects that are already tested). There are function call argument (type) specifications (interfaces) whether it's functional or OO.
I thought that TDD morphed into ending up with a regression/integration/confirmation test suite, instead of using tests as specifications written prior to writing the product. And even 100,000s of tests won't help you in very advanced applications like cloud/cluster infrastructure. Sometimes it's simply too difficult, if not impossible, to come up with tests (imagine the observer effect when your cluster deadlock happens only in certain rare nanosecond windows, and adding a testing framework makes you miss those windows so the problem never happens), and people with the mental capacity to write them (e.g. Google/FB-level) are better utilized writing the product itself.
How do the people who write them know that they work?
TDD presents a paradox that requires split-brain thinking: when writing a test, you pretend to forget what branch of code you are introducing, and when writing a branch, you pretend to forget you already knew the solution. It is annoying as hell.
You CAN indeed cover all your branches with tests afterwards. You can even give that a fancier name, like "Exploratory Testing". Of course it may be more boring or tedious, but it is a perfectly valid way to ensure coverage when needed.
TDD was great for popularizing writing test first; However I much prefer the methodology called CABWT - Cover All Branches With Tests. Let the devs choose the way to do it, because not everyone likes these pretend games.
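The "cover all branches with tests" idea is easy to show concretely. A tiny sketch (the function and tests are made up for illustration): the code is written first, and the tests are added afterwards so that every branch is exercised.

```python
# A small function with three branches, written before any tests exist...
def clamp(x, lo, hi):
    if x < lo:
        return lo
    if x > hi:
        return hi
    return x

# ...and tests written afterwards to cover every branch ("CABWT").
assert clamp(-5, 0, 10) == 0    # branch: below the lower bound
assert clamp(15, 0, 10) == 10   # branch: above the upper bound
assert clamp(7, 0, 10) == 7     # branch: already in range
print("all branches covered")
```

Whether the asserts existed before or after `clamp` was written, the resulting coverage is identical, which is the commenter's point.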
TDD requires you to write FUNCTIONAL tests first, not the unit tests you are talking about.
I was commenting on the methodology as I heard and watched it explained by the author (Robert C Martin), as well as the way it was presented in his videos.
The TDD workflow is fine; it's the "don't think about the pink elephant (the source code)" idea that bugs me.
Robert Martin is one of the authors of the Agile Manifesto.
https://www.quora.com/Why-does-Kent-Beck-refer-to-the-redisc...
The original description of TDD was in an ancient book about programming. It said you take the input tape, manually type in the output tape you expect, then program until the actual output tape matches the expected output.
+1. TDD could be considered as a derivation of the Scientific Method (Hypothesis Testing).
https://en.wikipedia.org/wiki/Scientific_method
https://en.wikipedia.org/wiki/Hypothesis
Test first isolates out a null hypothesis (that the test already passed); but not that it passes/fails because of some other chance variation (e.g. hash randomization and unordered maps).
https://en.wikipedia.org/wiki/Null_hypothesis
... https://en.wikipedia.org/wiki/Test-driven_development
+1 Right on the spot.
TDD requires you to draw your target first, then hit or miss it with the code, like in science: hypothesis -> confirmation/rejection via experiments -> working theory.
But in practice, a lot of coders hit a point first and then draw the target around that point, like in fake science: we throw a coin 100 times, the distribution is 60/40, our hypothesis: a random coin flip has a 60-to-40 ratio, our hypothesis is confirmed by experiment, huge savings, hooray!
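The "draw the target first" cycle can be sketched in plain Python. This is a toy red/green illustration (fizzbuzz is just a stand-in example, not something from the thread): the test exists before the implementation, and only then is code written to make it pass.

```python
# Red: draw the target first -- the test is written before fizzbuzz exists.
def test_fizzbuzz():
    assert fizzbuzz(15) == "FizzBuzz"
    assert fizzbuzz(9) == "Fizz"
    assert fizzbuzz(10) == "Buzz"
    assert fizzbuzz(7) == "7"

# Green: now write just enough code to make the test pass.
def fizzbuzz(n):
    out = ("Fizz" if n % 3 == 0 else "") + ("Buzz" if n % 5 == 0 else "")
    return out or str(n)

test_fizzbuzz()  # the experiment confirms the hypothesis
print("target hit")
```

The "fake science" version would be running `fizzbuzz` first and then writing assertions that merely restate whatever it happened to return.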
C for Python programmers (2011)
LearnXinYminutes tuts are pretty cool. I learned C++ before I learned Python (CPython ~2.0) before I learned C:
C++ ("c w/ classes", i thought to myself):
- https://en.wikipedia.org/wiki/C++
- https://learnxinyminutes.com/docs/c++/
Python 2, 3:
- https://en.wikipedia.org/wiki/Python_(programming_language)
- https://learnxinyminutes.com/docs/python/
- https://learnxinyminutes.com/docs/python3/
C:
- https://en.wikipedia.org/wiki/C_(programming_language)
- https://learnxinyminutes.com/docs/c/
And then I learned Java. But before that, I learned (q)BASIC, so
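One of the classic surprises for Python programmers picking up C is integer semantics. A small sketch, simulating two C behaviors from within Python (the `u32` helper is made up for illustration):

```python
# C integers are fixed-width and wrap around; Python ints are arbitrary
# precision. Simulating C's unsigned 32-bit arithmetic by masking:
def u32(x):
    return x & 0xFFFFFFFF  # keep only the low 32 bits, as a uint32_t would

print(u32(0xFFFFFFFF + 1))  # -> 0 (wraps; plain Python gives 4294967296)
print(u32(7 - 9))           # -> 4294967294 (unsigned underflow wraps too)

# C's integer division truncates toward zero; Python's // floors:
print(-7 // 2)       # Python floor division: -4
print(int(-7 / 2))   # C-style truncation: -3
```

Differences like these are why a C tutorial aimed specifically at Python programmers is worth having.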
Ask HN: How do you organise/integrate all the information in your life?
Hello fellow HNers,
How do you organise your life/work/side projects/todo lists/etc in an integrated way?
We have:
* To do lists/Reminders
* Bookmark lists
* Kanban boards
* Wikis
* Financial tools
* Calendars/Reminders
* Files on disk
* General notes
* ...
However, there must be a better way to get an 'integrated' view of your life. ToDo list managers suck at attaching relevant information; wikis can't do reminders; bookmarks can't keep track of notes and thoughts; etc. All of the above typically aren't crosslinked easily, and exporting data for backup/later consumption is hit and miss across services. So far, I've found a wiki to be the most flexible for keeping all manner of raw information somewhat organised, but it lacks useful features like reminders, has only minimal tagging support, no easy way to keep track of finances, etc.
I understand 'best tool for the job', but there's just so...many...
Emacs.
- Todo lists and reminders. org-agenda.
- Bookmark lists. org-capture and org-protocol.
- Kanban boards. I don't use this, but kanban.el.
- Wiki. Org-mode files and grep/ag with helm.
- Financial tools. ledger.
- Calendar/reminders. Org-agenda.
- Files on disk. dired, org-mode.
- General notes. Org-mode.
- Literate programming. org-babel.
- Mail. mu4e.
- rss. elfeed, gnus, or rss2email.
- git. magit.
- irc. erc.
- ...Org-Mode is the one thing that makes me want to move from vim to emacs.. but I just don't feel like investing the time in getting proficient with emacs at this stage.
https://en.wikipedia.org/wiki/Org-mode#Integration lists a number of Vim extensions with OrgMode support.
... "What is the best way to avoid getting "Emacs Pinky"?" http://stackoverflow.com/questions/52492/what-is-the-best-wa...
They are all pale imitations of what org-mode offers.
[deleted]
Ask HN: What are the best web tools to build basic web apps as of October 2016?
Questions:
1: Which technologies are popular, and what do people like about them? (To help someone deciding between them.) It seems like React, or perhaps Vue, for the frontend, and Node being the popular backend?
2: Is there a site that keeps track of the various options for frontend and backend frameworks and how their popularity progresses?
1. I work with Django, Django Rest Framework, and React; it works well. I've heard that Vue might be an interesting option. Honestly, there's a myriad of tools/frameworks out there, so you have to ask yourself what you're trying to achieve. Is it a one-shot app you'll build over the weekend? Something that you'd like to maintain over time? That may need to scale? Are you working alone on this or not? Is it an exercise to learn a new stack? When I started a startup 4 years ago, I asked a friend who's a Ruby/Rails dev for advice about the tech/stack. His answer: yes, Rails is awesome, but since you already know Python, go for Django, as my goal was primarily to get stuff done business-wise, not so much to learn cool new tech. So be aware of your options, and definitely spend some time getting to know them, but in the end they're just tools; don't forget why you use them in the first place.
2. I stumbled upon http://stackshare.io the other day, I can't vouch for it but seemed nice to have a quick overview of what languages / framework / services are used around.
+1 for http://stackshare.io/
* Backend performance: https://www.techempower.com/benchmarks/
* Frontend examples: https://github.com/tastejs/todomvc
There are tradeoffs between: performance, development speed, trainability (documentation), depth and breadth of developer community, long-term viability (foundation, backers), maintenance (upgrade path), API flexibility (decoupling, cohesion), standards compliance, vulnerability/risk (breadth), out of the box usability, accessibility, ...
So, for example, there are WYSIWYG tools which get like the first 70-80% of the serverside and clientside requirements; and then there's a learning curve (how to do the rest of the app as abstractly as the framework developers). ( If said WYSIWYG tools aren't "round-trip" capable, once you've customized any of the actual code, you have to copy paste (e.g. from a diff) in order to preserve your custom changes and keep using the GUI development tool. )
... Case in point: Django admin covers very many use cases (and is already testable, tested), but often we don't think to look at the source of the admin scaffolding app until we've written one-off models, views, and templates.
- Django class-based views abstract a lot of the work into already-tested components.
- Django REST Framework has OpenAPI (swagger) and a number of 3rd party authentication and authorization integrations available.
- In a frontend framework (MVVM), ARIA (Accessibility standards), the REST adapter and error handling are important (in addition to the aforementioned criteria (long-term viability, upgrade path)) ... and then we want to do realtime updates (with something like COMET, WebSockets, WebRTC)
Similar features in any framework are important for minimizing re-work. "Are there already tests for a majority of these components?"
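The "already-tested components" idea behind class-based views can be shown with a toy sketch. This is not Django itself, just plain Python illustrating the pattern: the shared machinery lives in one tested base class, and each concrete view overrides only the part that differs.

```python
# Toy illustration of the class-based-view pattern (not Django's actual API).
class ListView:
    """Base class: tested once, reused by every list endpoint."""

    def get_queryset(self):
        raise NotImplementedError  # each subclass supplies its data

    def render(self):
        items = self.get_queryset()
        return {"count": len(items), "results": items}

class AuthorListView(ListView):
    def get_queryset(self):
        return ["Alice", "Bob"]  # in real code: a database query

print(AuthorListView().render())  # -> {'count': 2, 'results': ['Alice', 'Bob']}
```

Because `render` is shared, a bug fix or test there benefits every view that inherits it, which is what minimizes re-work across an app.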
Harvard and M.I.T. Are Sued Over Lack of Closed Captions
I'm really surprised to see this anti-accessibility sentiment on HN, where you see comments about contrast, scroll hijacking, requiring JavaScript, and numerous other things the WCAG recommends against on nearly every link posted. You'd expect Harvard and M.I.T. to have wheelchair ramps so disabled students could get to class, right? It shouldn't be so controversial that their lectures should be accessible as well. Some may be concerned about the possible costs, but there are solutions to that, such as crowdsourcing. Universities are meant to spread knowledge for all, and even if they're providing a public good, they should still be held accountable for it.
Why should lectures given away to the public for free obligate the university to make them accessible? We aren't talking about enrolled students here. If we were, that would likely not even be an argument. But placing an obligation on a university to provide as much accessibility as they would for their students to the masses of the internet as a condition of freely sharing with the world? That's not the same thing. There's nuance here, and people are missing it.
Why is everyone getting so caught up in the free part? If the lectures were a dollar would you say that they're obligated to subtitle them then? Nope, then you'd say that they're forcing a private organization to do things. (Or do you have a specific cost threshold before you expect subtitles? Hint: deaf people don't.) It's great that they're releasing them for free, very altruistic, but they're also depriving disabled people of them which the ADA specifically requires. Besides, they could easily provide a mechanism for crowd-sourcing subtitles which I noted in another comment so the cost wouldn't really be as burdensome as people want to think. Also, are people forgetting that these are Harvard and MIT? The NAD isn't going to be suing your mom and pop website. Harvard and MIT can afford it and should be held to higher standards. I hate to use the word due to the anti-SJW frenzy the internet is in these days, but the ableism in this thread is appalling. No one is trying to see from the side of the NAD, with one user even suggesting that it just wants to line its coffers...
> Why is everyone getting so caught up in the free part?
To restate my final point: there is nuance here, and people are missing it. I'm commenting for the purpose of interrogating the nuance, because I feel somewhat mixed on the issue. For online course material that is offered to enrolled students and required for the completion of a degree, I do not believe anyone is arguing in opposition. However, the collection and publishing of course material provided to enrolled students, then sharing it for free to non-enrolled students is a different matter entirely. The free part is an important detail in this particular matter and its circumstances, and shouldn't be wholly ignored, or treated as if it isn't part of the equation.
> If the lectures were a dollar would you say that they're obligated to subtitle them then?
Possibly. Most likely, only if paying for the material and following the courses was somehow tied to earning enrolled status and credit toward a degree, though. Because at that point, someone is actually a student looking to obtain something in exchange for studying the material, the university is engaged in the activity for which it receives federal funding, and we would rightly expect the institution to treat them as students according to the law and all its glorious regulations that seek to provide all students with a level playing field. Giving the material to the public at large with no fees or strings attached--meaning no strings attached to either party--isn't something I think we should discourage.
> Or do you have a specific cost threshold before you expect subtitles?
There is no cost threshold on my mind, no. There is only the threshold of whether the parties consuming the materials are enrolled students seeking a degree at the institution.
> ... but they're also depriving disabled people of them ...
I'm not convinced this is true. The internet is full of freely available information from a variety of sources, much of it in video and audio form, and we do not have a longstanding debate centering on how much of the freely available information in video and audio form is depriving the hearing impaired of that information and should be made accessible. What's happening here is singling out a particularly easy target and asserting that they should be held to a different standard than all the other parties producing free, inaccessible content, and calling it "depriving disabled people" of the content. This stirs my something-isn't-quite-right detector, because we are attempting to provide a very narrowly scoped requirement onto a narrowly scoped party, on the basis of taking rules that inarguably apply to their services in one particular set of conditions, and applying them to another, quite different set of conditions.
> Also, are people forgetting that these are Harvard and MIT? The NAD isn't going to be suing your mom and pop website. Harvard and MIT can afford it and should be held to higher standards.
This is an argument from a pretty low set of standards, honestly. The ability of the party to afford increased accessibility sets up a rather disingenuous cash-gate on the issue, and completely debases the argument for accessibility into an argument about money. We're either concerned about establishing a proper set of guidelines and cultural expectations for making information accessible, regardless of its cost, or we're targeting entities with cash who are otherwise doing something we applaud, and saying because they have the means to do more, they should do more, and bringing the force of the state against them to compel them to do so. This is the kind of thinking that inexorably leads to crafting laws that target specific parties, leave open loopholes for other parties, and wind up subverting our intended goals by allowing those who wish to avoid a particular set of regulations and obligations by reorganizing under an uncovered entity type. We'd surely want to avoid such an outcome--even if it would help us better identify truly bad actors.
> I hate to use the word due to the anti-SJW frenzy the internet is in these days, but the ableism in this thread is appalling. No one is trying to see from the side of the NAD, with one user even suggesting that it just wants to line its coffers...
I don't think there is an appalling level of ableism in this thread. I am, and I think others are, trying to interrogate the issue from multiple perspectives, but we are coming to different conclusions (or are withholding conclusions) than you seem to expect. Perhaps that's because we're looking at the nuances of the circumstances.
From the NAD's perspective, I'm questioning and considering the affordances one ought to expect from information being made freely available in its original form, and what limitations can sensibly be agreed to exist--because it is insensible to expect there to be no limitations. This includes interrogating the alleged principles involved, and to what extent and to which parties they apply. When they don't apply equally to all parties, especially when they don't apply equally on the basis of one's ability to pay, I find the alleged principles reveal themselves to be suspect. This perspective, in particular, is the one in which the absence of affordances and obligations on information-releasing parties are felt most acutely. My lacking of a particular ability preventing me from equally enjoying informative and enlightening material is a bitter pill--especially if I can reasonably expect otherwise.
From the university and academia perspective, I am questioning and considering what reasonable thresholds one ought to be able to easily identify when releasing information in its original form freely to the public, or withholding it because additional accessibility affordances cannot reasonably be provided in light of the return on the time invested versus simply releasing the information. This perspective, in particular, is the one in which the force and burden of the obligations we levy as a society are felt most acutely. For instance, if a professor is teaching a course in which no persons enrolled have a disability, is the professor obligated to only use information which is accessible to serve the unknown contingent of internet consumers should the university decide to release the course materials freely on the internet? Are the lines only drawn at choosing videos with accurate captions? What about all the millions of people with other learning disabilities of some sort--what is the university obligated to do to ensure they are not "depriving disabled people" of this information? If other kinds of learning disabilities do not merit such affordances, why not? Why only this one or that one?
From the social and cultural perspective, I am questioning and considering what expectations and obligations we ought to hold in such cases for both parties, and how we should reasonably define these expectations for accessibility to as many people as possible and the obligations of implementation. Do we base our expectations and obligational determination on defining thresholds of sheer number of people who may potentially be affected by the lack of affordances? Do we only care about certain accessibility affordances, while ignoring others? Why or why not? We have, I think, passed the point of solving many of society's issues with hammers and saws. We now need scalpels. Much as medical science drastically improves outcomes by isolating bad things and eradicating them with precision, instead of simply removing a whole appendage, we need to pay attention to the nuances and rationally interrogate them to figure out what we think is best socially and culturally. If that's increasing the reach of federal disability law to cover information that is given away for free for the masses of internet consumers, okay. But we better establish some bulletproof and sane principles for doing so, and hold all information producers equally accountable. If we don't, then let's drop the veneer that we are holding all individual and organizational entities equally responsible and accountable, and admit we are instead targeting specific entities based on their perceived ability to pay for the increased obligation to be universally accessible.
Aside from feeling mixed in sum of all the above perspectives and not jumping to immediate and simplistic conclusions that ignore the nuances of circumstance, I continue to feel mixed because I think, as a principle, an accessible web is a better web. I hold firm to the principle that the more information people have access to, the better off they are, and the better a society is for providing this information as accessibly as possible to as many people as possible. However, I think there is also a somewhat disappointing need to include in our rational calculations when information producers, whomever they may be, are publishing that information as they have it, to put it out there, to share it widely with as many people as they can in the form it exists. Perhaps if we had better tools, we could abstract away the burden, then try taking the route of expecting entities to use certain technologies that alleviate the need to take on making things accessible on their own. If we make accessibility a social and cultural good and goal, how might that change how we produce information?
When a student is paying for an education in a federally-funded institution, it's reasonable to expect video-captioning, braille, text-scalable HTML (not PDF) wherever feasible. What about Sign Language interpretation? Simple English?
It would be great if everyone could afford to offer accessible content.
Maybe, instead of paying instructors, all lectures should be typed verbatim - in advance - and delivered by Text-to-Speech software (with gestural scripting and intonation). All in the same voice.
- Ahead-of-time lecture scripts could be used to help improve automated speech recognition accuracy.
- Provide additional support for paid captioning
-- Tools
-- Labor
- Provide support for crowdsourced captioning services
-- Feature: Upvote to prioritize
-- Feature: Flag as garbled
- Develop video-platform-agnostic transcription software (and make it available for free)
-- Desktop offline Speech-to-Text
-- Mobile offline Speech-to-Text
-- Speaker-specific language model training
- Require use of a video-platform with support for automated transcription
-- YouTube
--
- Companies with research in this space:
-- Speech Recognition, [Automated] Transcription, Autocomplete hinting for [Crowd-sourced] captioning
-- IBM
--- YouTube has automated transcription
--- Google Voice supports transcription corrections, but AFAIU it's not speaker-specific
-- Baidu
-- Nuance (Dragon,)
--
... Textual lecture transcriptions are useful for everyone; because Ctrl-F to search.
- Label (with RDFa structured data) accessible content to make it easy to find
-- Schema.org accessibility structured data (for Places, Events)
--- https://github.com/schemaorg/schemaorg/issues/254
--- http://schema.org/accessibilityFeature
--- http://schema.org/accessibilityPhysicalFeature and/or
--- http://schema.org/amenityFeature
- Challenges
-- Funding
-- CPU Time
-- Error Rate
-- Mitigating spam and vandalism
-- Human-verified crowdsourced corrections can/could be used to train recognizing and generative speaker-specific models
-- In the film A.I. (2001), there's a scene where they're asking questions of Robin Williams and the intonation/inflection inadvertently wastes one of their 3 wishes / requests. https://en.wikipedia.org/wiki/A.I._Artificial_Intelligence
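The "upvote to prioritize / flag as garbled" features above suggest a simple review queue. A hypothetical sketch (the scoring weights, segment IDs, and data shapes are all invented): caption segments are popped for human review, most-wanted and most-garbled first.

```python
# Sketch of a crowdsourced-captioning review queue (hypothetical design).
import heapq

def priority(segment):
    # More upvotes and more garbled-flags both raise review priority;
    # heapq is a min-heap, so negate the score.
    return -(segment["upvotes"] + 2 * segment["garbled_flags"])

segments = [
    {"id": "lec1@00:12", "upvotes": 3, "garbled_flags": 0},
    {"id": "lec1@04:55", "upvotes": 1, "garbled_flags": 4},
    {"id": "lec2@10:02", "upvotes": 0, "garbled_flags": 1},
]
queue = [(priority(s), s["id"]) for s in segments]
heapq.heapify(queue)
while queue:
    _, seg_id = heapq.heappop(queue)
    print(seg_id)  # lec1@04:55 first, then lec1@00:12, then lec2@10:02
```

Weighting garbled-flags above upvotes is just one choice; the point is that the crowd's signals, not an editor, decide where scarce correction effort goes first.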
> When a student is paying for an education in a federally-funded institution...
That's not what we are talking about.
Jack Dorsey Is Losing Control of Twitter
Well, one would hope so. Fortunately, Twitter doesn't have one of those "president for life" two-class stock setups like Google and Facebook. The stockholders can fire Dorsey when necessary.
The post-growth phase of a social network doesn't have to mean its collapse. Look at IAC, InterActive Corp (iac.com, ticker IAC). They run a lot of sites - Vimeo, Ask, About, Investopedia, Tinder, OKCupid, etc. - have a market cap of about $5 billion, and keep plugging along. They were started by Barry Diller, the creator of the Home Shopping Channel, something else that keeps plugging along. Diller is still CEO. IAC is boring but useful.
Twitter doesn't have to "exit"; they're already publicly held. They just have to trim down to a profitable and stable level, and accept that they're post-growth.
This is a great comment. For most companies, grinding along and sometimes taking a hit, firing people, slowing things down and then picking it up again and growing is the norm. If every company that stopped getting massive growth shut down, we wouldn't have factories mass-producing our day-to-day products, the post office wouldn't be delivering goods to us anymore, etc. Twitter is a mature organisation. There are jobs. There is cash flow. We should stop letting journalists make a big deal of companies that don't have infinite growth, and applaud the entrepreneurs that built the monster.
There is another angle that we often don't consider: who the media writes for. In general, the financial press writes for investors (this is the same for the tech media), and so their alarm bells and dire predictions are for investors, not necessarily about underlying fundamentals or facts of a business for everyone else.
- Are these journalists working for media companies that are competing for time?
- If they are shareholders, they don't seem to be declaring their conflicts of interest.
- For Twitter to respond would require that Twitter take editorial positions regarding the activities of competing media conglomerates. ("You're down, you should all just cash out now" [while you have far more daily active customers and revenue per user than a number of TV channels combined].)
- Are there competing international interests and biases? Is the market for noiseless citizen media saturated? How much time is there, really?
I'm not sure if I'm understanding you correctly, but I think you're considering Twitter to be a media company, like a newspaper company? If so, I don't think that comparison is accurate so there'd be no conflict of interest really.
The conflict of interest is if the journalists owned shares of Twitter.
The conflict of interest is that one media company is publishing negative articles about another media company amidst acquisition talks.
What is their interest here? How do they intend to affect the perceived value of Twitter? Are there sources cited? Data? Figures?
I'm still a bit confused. Are you implying that Bloomberg and Twitter are in competition? I'm not so sure they are, really, in any sense.
Bloomberg is a private corporation which sells ads and exercises editorial discretion in publishing market-moving information and/or editorials.
Twitter is a public corporation which hosts Tweets, sells ads, and selects trending Twitter Moments.
Both companies are media services. Both companies compete for ad revenue. Both companies compete for readers' time.
(Medium allows journalists to publish information and/or editorials for free (or, now, for subscription revenue from membership programs)).
Schema.org: Mission, Project, Goal, Objective, Task
Can someone explain to me how this fits into Schema.org's core mission (which, as I understand it, is basically to provide a lightweight ontological overlay on pages to assist search engines)? This seems like an attempt to build a formal project-management ontology for systems that will never get indexed by Google -- an effort that might benefit more from the Basic Formal Ontology as a starting point.
A few (search) use cases:
- I want to find an organization with a project with similar goals and objectives.
- I want to find other objectives linked to indicators.
- I want to say "the schema:Project described in this schema:WebPage is relevant to one or more #GlobalGoals" (so that people searching for ways to help can model similar goal-directed projects)
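The third use case could look something like the JSON-LD below. Note this is hypothetical markup: `Project` and the goal-linking shown here were proposals under discussion, not an official part of schema.org, and the URL is made up.

```python
import json

# Hypothetical JSON-LD for the proposed schema:Project type.
project = {
    "@context": "https://schema.org",
    "@type": "Project",  # proposed type, not (yet) official schema.org
    "name": "Community Well Drilling",
    "about": "#GlobalGoals",  # e.g. clean water and sanitation
    "subjectOf": {
        "@type": "WebPage",
        "url": "https://example.org/projects/wells",  # invented URL
    },
}
print(json.dumps(project, indent=2))
```

With markup like this on the page, a search engine (or anyone with a JSON-LD parser) could match projects by shared goals rather than by keyword overlap.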
so basically you make your site more easily readable by search engines, which in turn extract the data from your site, stealing your traffic?
for example, if you search for something on Google, you get a preview box of the main info with a link to the original article (e.g. on Wikipedia, IMDb, etc., depending on what you search for); in most cases you no longer need to visit the original link.
There are always two ways to look at things.
In the case of Wikipedia and similar high-traffic websites, it probably takes a bit of the load off. It's probably a considerable saving in terms of serving requests if a user searches for an actor they think they recognise from a movie and see a snippet from Google served from Google's cache rather than a full pageview from Wikipedia.
In the case of physical stores and offices, I would imagine that a check for opening hours or telephone number shouldn't be counted as a pageview -> conversion anyway — they already chose you i.e. converted. Maybe they're a returning customer or somebody who liked the email marketing you sent them.
Is Google Maps stealing traffic from you by showing your business on the map? Maybe, but they're also doing you a favour. Neither Google nor Google Maps are likely to go anywhere anytime soon, and a considerable number of people are only going to check Google Maps and won't bother checking your site for the same information w.r.t. your business's location.
Beneficial in terms of reviews as well — displaying the aggregated score in a rich snippet is arguably preferable to letting the user click through where they can be persuaded against the 4 star average by an impassioned negative review.
Google obviously wants to be the one and only gateway to the web, and they want to keep you on Google pages as much as possible to show as many ads as possible, but unless you're selling ads, counting pageviews/traffic is pointless unless it's new business, and people new to you are unlikely to make a snap judgement based on a rich snippet. If anything, it's more likely to convince the new user to click through to your site.
[IMHO]
> In the case of physical stores and offices, I would imagine that a check for opening hours or telephone number shouldn't be counted as a pageview -> conversion anyway — they already chose you i.e. converted. Maybe they're a returning customer or somebody who liked the email marketing you sent them.
As well, adding structured data to the page (with RDFa, JSONLD, or Microdata) makes it much easier for voice assistant apps to parse out the data people ask for.
> Google [...]
Schema.org started as a collaborative effort between Bing, Google, Yahoo, and then Yandex. Anyone with a parser can read structured data from an HTML page with RDFa or a JSONLD document.
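To make the "anyone with a parser" point concrete: here is a minimal JSON-LD block of the kind that carries opening hours, extracted with nothing but the standard library. The business details are made up:

```python
import json

# A hypothetical page fragment carrying schema.org structured data.
html_snippet = """
<script type="application/ld+json">
{"@context": "https://schema.org",
 "@type": "LocalBusiness",
 "name": "Example Bakery",
 "telephone": "+1-555-0100",
 "openingHours": "Mo-Sa 07:00-18:00"}
</script>
"""

# A real consumer would use an HTML parser; slicing is enough for a sketch.
start = html_snippet.index(">") + 1
end = html_snippet.rindex("</script>")
data = json.loads(html_snippet[start:end])
print(data["openingHours"])  # -> Mo-Sa 07:00-18:00
```

This is exactly the kind of payload a voice assistant can answer "are they open now?" from, without scraping free-form prose.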
The Open Source Data Science Masters
This seems like a nice compilation for introductory material in one place.
I still can't get over the term "data science", though. Not only is it ridiculously meaningless - what sort of science doesn't involve data, and how often would data be useful to something that isn't scientific at some level - its meaninglessness derives from the hyped buzzword trendiness that drove its upswing.
I say this as someone whose expertise is really sitting at the nexus of what would be considered data science. I feel as if I have been doing what might be considered data science for a long time, before there was a label for it, but watching its ascendance in demand and popularity has been troubling. I should be happy, but I feel like it's being driven by fashion rather than fundamentals, which makes me worried about the trajectory going forward, and disturbed by some communities being thrown under the bus.
>I still can't get over the term "data science", though. Not only is it ridiculously meaningless - what sort of science doesn't involve data, and how often would data be useful to something that isn't scientific at some level - its meaninglessness derives from the hyped buzzword trendiness that drove its upswing.
I couldn't disagree more.
There are a number of terms for domain-independent data analysis:
- data analysis
- statistics
- statistical modeling
- machine learning
- big data
- data journalism
- data science
I think it makes perfect sense that the practice of collecting and analyzing data be qualified and identified as a specific field.
I know of no better resource than these venn diagrams which identify the 'danger zones' around data science:
- http://datascienceassn.org/content/fourth-bubble-data-scienc...
Is there such a thing as a statistical model which only applies to a certain domain?
Domain knowledge ("substantive expertise"/"social sciences" in the linked Venn diagrams) serves only to logically validate statistical models which may be statistically valid but otherwise illogical in the context of currently available field knowledge (bias).
Regardless of field, the math is the same.
Regardless of field, the model either fits or it doesn't.
Regardless of field, the controls were either sufficient or they weren't.
We Should Not Accept Scientific Results That Have Not Been Repeated
So, we should have a structured way to represent that one study reproduces another? (e.g. that, with similar controls, the relation between the independent and dependent variables was sufficiently similar)
- RDF is the best way to do this. RDF can be represented as RDFa (RDF in HTML) and as JSON-LD (JSON LinkedData).
... " #LinkedReproducibility "
https://twitter.com/search?q=%23LinkedReproducibility
It isn't/wouldn't be sufficient to, with one triple, say (example.org/studyX, 'reproduces', example.org/studyY); there is a reified relation (an EdgeClass) containing metadata like who asserts that studyX reproduces studyY, when they assert that, and why (similar controls, similar outcome).
Today, we have to compare PDFs of studies and dig through them for links to the actual datasets from which the summary statistics were derived; so specifying who is asserting that studyX reproduces studyY is very relevant.
Ideally, it should be possible to publish a study with structured premises which lead to a conclusion (probably with formats like RDFa and JSON-LD, and a comprehensive schema for logical argumentation which does not yet exist). ("#StructuredPremises")
Most simply, we should be able to say "the study control type URIs match", "the tabular column URIs match", "the samples were representative", and the identified relations were sufficiently within tolerances to say that studyX reproduces studyY.
Doing so in prosaic, parenthetical two-column PDFs is wasteful and shortsighted.
An individual researcher then builds a set of beliefs about relations between factors in the world from a graph of studies ("#StudyGraph") with various quantitative and qualitative metadata attributes.
As fields, we would then expect our aggregate #StudyGraphs to indicate which relations between dependent and independent variables are relevant to prediction and actionable decision making (e.g. policy, research funding).
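The reified "reproduces" edge described above can be sketched as JSON-LD. No standard vocabulary for this exists yet, so every term under the "ex:" prefix here is a hypothetical placeholder, not an existing schema:

```python
# A sketch of a reified reproducibility assertion (an "EdgeClass"):
# not just (studyX, reproduces, studyY), but also who asserts it,
# when, and on what basis. All "ex:" terms are invented for this example.
import json

assertion = {
    "@context": {"ex": "http://example.org/ns#"},
    "@id": "http://example.org/assertions/1",
    "@type": "ex:ReproducibilityAssertion",
    "ex:subjectStudy": {"@id": "http://example.org/studyX"},
    "ex:reproduces": {"@id": "http://example.org/studyY"},
    "ex:assertedBy": {"@id": "http://example.org/researchers/jdoe"},
    "ex:assertedOn": "2016-02-07",
    "ex:basis": ["similar controls", "similar outcome"],
}

doc = json.dumps(assertion, indent=2)
print(doc)
```

A #StudyGraph would then be a collection of such assertions, queryable by who asserted what and why.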
The SQL filter clause: selective aggregates
You can do these with Ibis and various SQL engines:
* http://docs.ibis-project.org/sql.html#aggregates-considering...
* https://github.com/cloudera/ibis/tree/master/ibis/sql (PostgreSQL, Presto, Redshift, SQLite, Vertica)
* https://github.com/cloudera/ibis/blob/master/ibis/sql/alchem... (SQLAlchemy)
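The FILTER clause itself is easy to try from the stdlib sqlite3 module (SQLite supports FILTER on aggregates since 3.30); the table and data below are made up for illustration:

```python
# Selective aggregates with FILTER: several differently-filtered
# aggregates computed in a single pass over the table.
import sqlite3

con = sqlite3.connect(":memory:")
con.execute("CREATE TABLE orders (region TEXT, amount REAL)")
con.executemany(
    "INSERT INTO orders VALUES (?, ?)",
    [("east", 10.0), ("east", 5.0), ("west", 7.0), ("west", 20.0)],
)

row = con.execute(
    """
    SELECT count(*)                                   AS n_total,
           count(*) FILTER (WHERE amount > 8)         AS n_large,
           sum(amount) FILTER (WHERE region = 'east') AS east_total
    FROM orders
    """
).fetchone()
print(row)  # (4, 2, 15.0)
```

On engines without FILTER, the portable fallback is `sum(CASE WHEN ... THEN 1 ELSE 0 END)`; FILTER is essentially sugar for that.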
Ask HN: What do you think about the current education system?
Is it good or bad? What can be done better? What problems do you identify? Is it upsetting? Are you used to it?
A Reboot of the Legendary Physics Site ArXiv Could Shape Open Science
Interesting to see their approach and the reasons why they are not building more features on top of ArXiv. Although comments on papers might be a dangerous area to venture into, there are definitely places on the web where the ability to annotate and comment on papers is helping science move forward. A good example is the Polymath project and Terry Tao's blog. Tao's recent solution to the Erdos discrepancy problem, an 80-year-old number theory problem, was actually triggered by a comment on his blog. Another example is www.fermatslibrary.com. Although the papers on the platform are more historical/foundational, they were able to get consistently good/constructive comments that help people understand papers better.
I've written up a few ideas about PDFs, edges, and reproducibility (in particular); with the Hashtags #LinkedReproducibility (and #MetaResearch)
https://twitter.com/search?q=%23LinkedReproducibility
https://twitter.com/search?q=%23MetaResearch
- schema.org/MedicalTrialDesign enumerations could/should be extended to all of science (and then added to all of these PDFs), with structured edge types like e.g. {intendedToReproduce, seemsToReproduce} (which then have specific ensuing discussions)
- http://health-lifesci.schema.org/MedicalTrialDesign
- there should be a way to evaluate controls in a structured, blinded, meta-analytic way
- PDF is pretty, but does not support RDFa (because this is a graph)
... notes here: https://wrdrd.com/docs/consulting/data-science#linked-reprod...
(edit) please feel free to implement any of these ideas (e.g. CC0)
Principles of good data analysis
Helpful; thanks!
"Ten Simple Rules for Reproducible Computational Research" http://www.ploscompbiol.org/article/info%3Adoi%2F10.1371%2Fj...
* Rule 1: For Every Result, Keep Track of How It Was Produced
* Rule 2: Avoid Manual Data Manipulation Steps
* Rule 3: Archive the Exact Versions of All External Programs Used
* Rule 4: Version Control All Custom Scripts
* Rule 5: Record All Intermediate Results, When Possible in Standardized Formats
* Rule 6: For Analyses That Include Randomness, Note Underlying Random Seeds
* Rule 7: Always Store Raw Data behind Plots
* Rule 8: Generate Hierarchical Analysis Output, Allowing Layers of Increasing Detail to Be Inspected
* Rule 9: Connect Textual Statements to Underlying Results
* Rule 10: Provide Public Access to Scripts, Runs, and Results
Sandve GK, Nekrutenko A, Taylor J, Hovig E (2013) Ten Simple Rules for Reproducible Computational Research. PLoS Comput Biol 9(10): e1003285. doi:10.1371/journal.pcbi.1003285
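A few of these rules (1, 3, 5, and 6) can be sketched in a few lines: record how a result was produced, the software versions, the intermediate results, and the random seed, all alongside the result itself in a standardized format. File and field names here are illustrative, not from the paper:

```python
# Provenance record for a toy analysis: the seed, versions, and
# intermediate values needed to regenerate the result exactly.
import json
import platform
import random
import sys

SEED = 42                      # Rule 6: note the underlying random seed
random.seed(SEED)
sample = [random.random() for _ in range(5)]
result = sum(sample) / len(sample)

provenance = {
    "rule1_produced_by": "mean of 5 random.random() draws",
    "rule3_versions": {
        "python": sys.version.split()[0],
        "platform": platform.platform(),
    },
    "rule5_intermediate": sample,   # standardized, machine-readable
    "rule6_seed": SEED,
    "result": result,
}
record = json.dumps(provenance, indent=2)
print(record)
```

Anyone holding the record can re-seed, re-draw, and verify the intermediate values and result match.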
This is way more useful than the original post, thanks.
This is great, thank you!
Why Puppet, Chef, Ansible aren't good enough
So, basically, replace yum, apt, etc. with a 'stateless package management system'. That seems to be the gist of the argument. Puppet, Chef and Ansible (he left out Salt and cfengine!) have little to do with the actual post, and are only mentioned briefly in the intro.
They would all still be relevant with this new packaging system.
For some reason, this came to mind: https://xkcd.com/927/
No.
Well, yes, replace yum, apt, etc. But once you have a functional package management system, you don't need Puppet, Chef or Ansible, because the same stateless configuration language can be used to describe cluster configurations as well as packages. So build a provisioning tool based on that, instead.
That provisioning tool is called NixOps. The article links to it, but doesn't really go into detail about NixOps as a replacement for Puppet et al.
am I going to switch distros just to use a different provisioning tool?
Yes... Eventually.
The Nix model is the only sane model going into the future where complexity will continue to increase. Nobody is forcing you to switch distros now, but hinting that you'll probably want to look into this, because the current models might well be on their way to obsolescence.
* Unique PREFIX; cool. Where do I get signed labels and checksums?
* I fail to see how baking configuration into packages can a) reduce complexity; b) obviate the need for configuration management.
* How do you diff filesystem images when there are unique PREFIXes in the paths?
A salt module would be fun to play around with.
How likely am I to volunteer to troubleshoot your unique statically-linked packages?
[deleted]
Python vs Julia – an example from machine learning
1. Where is the source for this benchmark?
2. http://benchmarksgame.alioth.debian.org could be a bit more representative of broad-based algorithmic performance.
3. There are lots of Python libraries for application features other than handpicked algorithms. I would be interested to see benchmarks of the marshaling code in IJulia (IPython with Julia).
Free static page hosting on Google App Engine in minutes
Years back, a tool called DryDrop[1] was created that lets you publish static GAE sites via a GitHub push.
https://developers.google.com/appengine/docs/push-to-deploy (Git + .netrc)
Seems nice! How do you set up a custom domain name for it?
“Don’t Reinvent the Wheel, Use a Framework” They All Say
1. WordPress is an application with a plugin API. It is not a framework.
2. Writing a web application without a framework is a good learning experience. For anything but small-scale local learning experiences, the risks and costs of not working with a framework are significant. [It is probable that I, with my ego, would "do it wrong" and that a community of developers has arrived at a far superior solution.]
One of the best explanations I've found of what advantages a framework offers over basically writing your own is in the Symfony2 book: "Symfony2 versus Flat PHP". [1]
[1] http://symfony.com/doc/current/book/from_flat_php_to_symfony...
PEP 450: Adding A Statistics Module To The Standard Library
I think Pandas is a great candidate for inclusion in the stdlib if this ever happens - and hopefully, numpy/parts of scipy will also be thrown in :)
I think the idea is to include a small independent stats package, not a full-featured, still-developing third-party library. Any number of people need std dev easily and reliably available. If you need numpy on top of that, you know you do and can afford the effort.
For 99% of my work numpy and the associated compilation overhead is unneeded - fits my brain, fits my needs
So let's amortize the cost of compiling and/or installing fast binaries by only relying on plain Python.
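For the record, PEP 450 did eventually land as the pure-Python `statistics` module in Python 3.4: no compilation step, no third-party install. A taste:

```python
# The stdlib statistics module: batteries-included descriptive stats.
import statistics

data = [2.0, 4.0, 4.0, 4.0, 5.0, 5.0, 7.0, 9.0]

print(statistics.mean(data))    # 5.0
print(statistics.median(data))  # 4.5
print(statistics.pstdev(data))  # 2.0 (population standard deviation)
```

The migration story mostly worked out, too: `statistics.mean` and `numpy.mean` share a name, so porting is often just a change of import.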
It would be great if there were a natural progression (and/or compat shims) for porting from this new stdlib library to NumPy[Py] (and/or from LibreOffice) (e.g. "Is it called 'cummean'?").
I guess that's the point of the stats-battery - pure python stats with no / minimal cost to migrate to numpy e.g.
from stats import mean
...
from numpy import mean
Functional Programming with Python
Great deck about functional programming in Python. Also:
* `operator.attrgetter`, `getattr()`, `setattr()`, `object.__getattribute__` [1][2]
* `operator.itemgetter`, `object.__getitem__` [3][4]
* `collections.abc` [5][6]
[1] http://docs.python.org/2/library/operator.html#operator.attr...
[2] http://docs.python.org/2/reference/datamodel.html#object.__g...
[3] http://docs.python.org/2/library/operator.html#operator.item...
[4] http://docs.python.org/2/reference/datamodel.html?#object.__...
[5] http://docs.python.org/2/library/collections.html#collection...
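A quick illustration of the operator helpers listed above: they return plain functions, which composes nicely with sorted(), map(), and friends.

```python
# attrgetter and itemgetter turn attribute/item access into functions.
from collections import namedtuple
from operator import attrgetter, itemgetter

Point = namedtuple("Point", ["x", "y"])
points = [Point(3, 1), Point(1, 2), Point(2, 0)]

by_x = sorted(points, key=attrgetter("x"))   # attribute access as a function
print(by_x[0])  # Point(x=1, y=2)

rows = [("b", 2), ("a", 1), ("c", 3)]
by_key = sorted(rows, key=itemgetter(0))     # __getitem__ as a function
print(by_key)   # [('a', 1), ('b', 2), ('c', 3)]
```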
wow, i love your presentation design, type and function :)
is this a reveal.js rendering of an IPython notebook ?
ipython nbconvert --to slides <notebook.ipynb>
http://ipython.org/ipython-doc/dev/interactive/nbconvert.htm...
PEP 8 Modernisation
> The default wrapping in most tools disrupts the visual structure of the code, making it more difficult to understand.
It's 2013. Let's fix the tools.
You'd think we would have gotten past the point of having to manually figure out "where should I break these lines for the best readability." Code is meant to be consumed by machines, and a machine should be capable of parsing the stuff, figuring out the visual structure, and wrapping dynamically to account for window width and readable line lengths in a way that preserves that structure.
I upvoted you for this - I think it's a really interesting idea.
Why can't my editor tokenize my code and then show it in the format I want? Why do I, as the programmer, have to worry about whether one whitespace convention works for you versus me, when you could just come up with whatever scheme you like and view code that way?
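The parsing half of this already exists in Python's stdlib: `tokenize` turns source into a layout-independent token stream, and `untokenize` turns it back, which is exactly the hook a width-aware, per-viewer reformatter could build on. A minimal sketch:

```python
# Tokenize source into a stream an editor could re-lay-out at any
# width, then round-trip it back to text.
import io
import tokenize

src = "result = f(alpha, beta, gamma)\n"
tokens = list(tokenize.generate_tokens(io.StringIO(src).readline))

kinds = {tokenize.tok_name[t.type] for t in tokens}
print(sorted(kinds))  # ['ENDMARKER', 'NAME', 'NEWLINE', 'OP']

# With full token tuples, untokenize reproduces the source exactly,
# so nothing is lost by working on the token stream.
rebuilt = tokenize.untokenize(tokens)
print(rebuilt)
```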
Funny, someone was just talking about 79 characters per line in regards to using soft tabs for editor display consistency, the other day.
http://www.reddit.com/r/java/comments/1j7iv4/would_it_not_be...
These are useful for static code analysis and finding congruence with typesetting conventions:
https://pypi.python.org/pypi/flake8
Useful Unix commands for data science
The "Text Processing" category of this list of unix utilities is also helpful: http://en.wikipedia.org/wiki/List_of_Unix_programs
BashReduce is a pretty cool application of many of these utilities.
The data visualization community needs its own Hacker News
It'd be really nice if there was a more centralized place to submit and discuss recent work. I usually end up posting my stuff to visualizing.org, visual.ly, and sometimes hn.
There generally isn't too much commenting going on though, despite generating a decent number of views. This makes sense - I only give feedback to a small percentage of things I see on the internet. I would like to interact with other designers more, but sending unsolicited comments/criticism via a tweet or an email doesn't seem appropriate most of the time.
I included a small note on the bottom of my last project "Have an idea for another graphic? Think I did something wrong? Hit me up! adam.r.pearce@gmail.com | @adamrpearce" which resulted in a couple of interesting email threads. None of those conversations really need to be private though.
/r/d3js , /r/visualization , /r/Infographics
schema.org/ > Thing > CreativeWork > { Article, Dataset, DataCatalog, MediaObject, and CollectionPage } may also be helpful.
Ask HN: Intermediate Python learning resources?
So I've completed Codecademy's course on Python, I have some experience fiddling with Flask and putting together random Python scripts. Generally, when I want to build something that I've never built before, I look up how to do it on Stackoverflow and manage to understand most of the things.
How can I take my knowledge to the next level?
Free learning resources are preferred. Hopefully ones you have used yourself when in my position.
Thanks!
I'm in a very similar position.
If you really like Codecademy, there are non-track exercises that involve Python in the API section [0] and a couple of Python challenges [1][2] that aren't listed.
What I'm doing now:
* Solving exercises on Project Euler in Python. [3]
* Working through each example in the Python Cookbook[4]. It was just updated to the third edition.
* Watched Guido's Painless Python talks from a few years ago [5]. I found his concise explanations of language features really helpful.
Some things I intend to do:
* Finish working through Collective Intelligence [6]. The examples are written in Python.
* Work through Introduction to Algorithms [7]. The course uses Python.
* Read, understand and give a shot at extending Openstack [8] code.
-----
0: http://www.codecademy.com/tracks/apis
1: http://www.codecademy.com/courses/python-intermediate-en-NYX...
2: http://www.codecademy.com/courses/python-intermediate-en-VWi...
5: http://www.youtube.com/watch?v=bDgD9whDfEY
7: http://ocw.mit.edu/courses/electrical-engineering-and-comput...
Don't Project Euler's exercises seem more like maths exercises? That makes it kinda difficult for those who graduated from the social sciences and are trying to learn programming from scratch.
The Green Tea Press books are great; and free.
Think Python: How To Think Like a Computer Scientist http://www.greenteapress.com/thinkpython/thinkpython.html
Think Complexity: Exploring Complexity Science with Python : http://www.greenteapress.com/compmod/
Think Stats: Probability and Statistics for Programmers : http://www.greenteapress.com/thinkstats/index.html
You can search announced, in progress, future, self-paced, and finished MOOCs (Massive Open Online Courses) with class-central.com : http://www.class-central.com/search?q=python
Here are three resources for learning Python for Science :
http://scipy-lectures.github.io
https://github.com/jrjohansson/scientific-python-lectures
https://github.com/ipython/ipython/wiki/A-gallery-of-interes...
Ansible Simply Kicks Ass
> Doing this with Chef would probably mean chasing down a knife plugin for adding Linode support, and would simply require a full Chef stack (say hello to RabbitMQ, Solr, CouchDB and a gazillion smaller dependencies)
It is throwaway lines like that where you really need to be careful since, no, you don't need RabbitMQ, Solr, CouchDB, etc. You can just use chef-solo, which can also be installed with a one-liner (albeit by running a remote bash script).
When comparing two products (especially in an obviously biased manner) you need to make sure you are 100% correct. Otherwise you weaken your case and comments like this one turn up.
True, though chef-solo is a local-only solution. Ansible manages remote machines without that kind of setup, and can start managing them without installing anything on them.
1. Install chef-solo on the server.
2. scp the locally dev'd (ideally with Vagrant) cookbook to the server.
3. Run the cookbook.
3 steps that can easily be wrapped in a little script (which I know a large company does because I saw their presentation about it on confreaks. Sorry I cannot remember the name of the company or presentation but it was a chef related one).
Still not exactly killing it in terms of complexity. I would avoid comparing ansible to chef-solo in that respect and focus on bits where ansible has a clear (IMHO) win.
Having said that I should say that I have not used ansible and am basing this on what I have read about it.
You don't even need to do that. You can type `knife solo prepare root@my-server` and it will install chef-solo on that machine. Then type `knife solo cook root@my-server` and you're good to go.
Well damn, I have been doing it wrong. That is awesome to know though.
I wonder if anyone has done a http://todomvc.com/ equivalent for cfengine, puppet, chef, salt, ansible etc.
Something like a simple webserver running Apache with mod_xsendfile, Passenger, Ruby 2.0.0, PostgreSQL, and a few firewall tweaks (why yes, I AM mainly a Ruby developer, why do you ask?)
https://github.com/devstructure/blueprint [generates] configuration sets for "Puppet or Chef or CFEngine 3."
https://en.wikipedia.org/wiki/Comparison_of_open-source_conf...
https://ops-school.readthedocs.org/en/latest/config_manageme...
Python-Based Tools for the Space Science Community
"OSX install has become a challenge. With the Enthought transition to Canopy we cannot figure out clean install directions for 3rd party packages and therefore can no longer recommend using EPD for SpacePy."
Um... use Anaconda? http://continuum.io/anaconda
The Python installation tool utilized to install different versions of Anaconda and component packages is called [conda](http://docs.continuum.io/conda/intro.html). [pythonbrew]( https://github.com/utahta/pythonbrew) in combination with [virtualenvwrapper](http://virtualenvwrapper.readthedocs.org/en/latest/) is also great.
Big-O Algorithm Complexity Cheat Sheet
JSON API
In terms of http://en.wikipedia.org/wiki/Linked_data , there are a number of standard (overlapping) URI-based schemas for describing data with structured attributes:
* http://schema.org/docs/full.html
* http://schema.rdfs.org/all.json
* http://schema.rdfs.org/all.ttl (Turtle RDF Triples)
* http://json-ld.org/spec/latest/json-ld/
* http://json-ld.org/spec/latest/json-ld-api/
* http://www.w3.org/TR/ldp/ Linked Data Platform TR defines a RESTful API standard
* http://wiki.apache.org/incubator/MarmottaProposal implements LDP 1.0 Draft and SPARQL 1.1
Norton Ghost discontinued
Would love to hear ideas on the best replacement from other HNers.
Open-source utilities running from livecd.
dd if=/dev/sda bs=16777216 | gzip -c9 > /path/to/sda.img.gz
If you have the pv utility installed, you can get progress bars: dd if=/dev/sda bs=16777216 | pv -c -W | gzip -c9 | pv -c -W > /path/to/sda.img.gz
To restore (WARNING: THE FOLLOWING COMMAND IS EXTREMELY DANGEROUS, USE IT ONLY IF YOU KNOW EXACTLY WHAT YOU'RE DOING): pv -c -W /path/to/sda.img.gz | gzip -cd | pv -c -W | dd bs=16777216 of=/dev/sdx
Of course, you can use your favorite compression program instead of gzip; on Debian-like systems, bzip2 or xz should be drop-in replacements available by default.
If you want a compressed image which you can mount read-only without uncompressing (e.g. if you want to be able to reach into the backup and pull out a single file or directory), you can pipe the dump to a FIFO and then use the pseudo-file feature of mksquashfs [1]. E.g. something like this (commands not tested, there may be typos):
mkfifo sda.fifo
echo 'sda.img f 444 root root cat sda.fifo' > sda.pf
dd if=/dev/sda bs=16777216 > sda.fifo &
mkdir empty
mksquashfs empty sda.squashfs -pf sda.pf
mount -o ro,loop sda.squashfs /mnt
You then have an uncompressed image visible. If it's a whole-disk image (/dev/sda instead of /dev/sda1), you need to use a program called kpartx to make device nodes for each partition.
I recommend reading the man pages and playing around with these utilities in a VirtualBox VM with a small disk.
Of course, this approach doesn't pay any attention to filesystems. Which means it works with Windows partitions (unlike LVM or btrfs snapshots). But there are limitations; it reads and stores "empty" space not occupied by files (I recommend making a large file full of zeroes beforehand to make the empty space compress better), restoring to a smaller disk is difficult if not impossible, restoring to a larger disk requires you to resize manually afterwards if you want to use the extra space.
[1] It's a little easier conceptually to make a squashfs directly from an image file, but that approach requires you to store the image file, which requires temp space equal to the size of the disk you're backing up. The pseudo-file approach avoids that; you just need enough space for the compressed result.
That misses much of what Ghost does. Ghost is file-system-aware, so it's not a byte-for-byte cloner. It can copy one volume to a large volume, and vice versa.
You can accomplish the same thing in Linux, of course, but it requires a lot more than just "dd".
http://clonezilla.org supports http://partclone.org/ , http://www.partimage.org/ , dd, and ntfsclone.
Weren't BBS (Bulletin Board Systems) lame before Facebook, too?
BBS > List of features https://en.wikipedia.org/wiki/Bulletin_board_system
Metaverse > History https://en.wikipedia.org/wiki/Metaverse
TIL there's a VR version of Flight Simulator 2020 and it has the best Earth model of any game. Is that a metaverse? https://en.wikipedia.org/wiki/Microsoft_Flight_Simulator_(20...
> Flight Simulator simulates the topography of the entire Earth using data from Bing Maps. Microsoft Azure's artificial intelligence (AI) generates the three-dimensional representations of Earth's features, using its cloud computing to render and enhance visuals, and real-world data to generate real-time weather and effects. Flight Simulator has a physics engine to provide realistic flight control surfaces, with over 1,000 simulated surfaces, as well as realistic wind modelled over hills and mountains
AFAIU, e.g. Microsoft Planetary Computer data is not yet integrated into any Virtual Game World Metaverses? An in-game focus on real world sustainability would help us understand that online worlds are very much connected to the real world. https://planetarycomputer.microsoft.com/applications
https://github.com/TomAugspurger/scalable-sustainability-pyd... describes how that can be done with Python code.